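Since taking office, Trump has been cracking down hard on diversity-related topics. In particular, federal agency documents that mention diversity issues have been ordered taken down by executive order. The CDC, for example, had a large number of research documents (including papers) pulled: "CDC webpages go dark as Trump targets public health information".
So Archive Team has also been spotted "rescuing" these documents: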
US Government: Archiving the US government. IRC Channel #UncleSamsArchive (on hackint)
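Two days ago, Harvard Law School announced that they have an archive on hand. It covers only data.gov, but that should be fairly complete, since essentially all of the federal government's public data can be found there. They have it because they have been backing the site up regularly since 2024: "Announcing the Data.gov Archive".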
Today we released our archive of data.gov on Source Cooperative. The 16TB collection includes over 311,000 datasets harvested during 2024 and 2025, a complete archive of federal public datasets linked by data.gov. It will be updated daily as new datasets are added to data.gov.
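As a rough sense of scale, the two figures in the quote above can be turned into an average size per dataset. This is only a back-of-the-envelope sketch: it assumes "16TB" means decimal terabytes and treats "over 311,000" as exactly 311,000, so the real average is somewhat lower. The function name is made up for illustration.

```python
# Figures quoted in the announcement.
# Assumptions: 16 TB is decimal (10**12 bytes per TB), and
# "over 311,000" is rounded down to exactly 311,000 datasets.
TOTAL_BYTES = 16 * 10**12
DATASET_COUNT = 311_000


def average_dataset_size_mb(total_bytes: int, count: int) -> float:
    """Return the mean size per dataset in decimal megabytes."""
    return total_bytes / count / 10**6


avg_mb = average_dataset_size_mb(TOTAL_BYTES, DATASET_COUNT)
print(f"average dataset size: about {avg_mb:.1f} MB")
```

Under those assumptions the average works out to roughly 51 MB per dataset, which suggests the bulk of the 16TB probably sits in a relatively small number of large datasets.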
They have also released the tools they use for the backup, which are a bunch of scripts written in Python:
In addition to the data collection, we are releasing open source software and documentation for replicating our work and creating similar repositories. With these tools, we aim not only to preserve knowledge ourselves but also to empower others to save and access the data that matters to them.
The project page also notes that the archive will be updated more frequently starting 2025/02/06:
Files in this repository were collected intermittently between 2024-11-19 and 2025-02-06.
Beginning on 2025-02-06, we will update the repository daily.
This takes care of the urgent part. Figuring out how to make use of the data can wait until later...