Philadelphia's Department of Records has been quietly working through one of the more unglamorous problems in municipal government: thousands of duplicate photographs, scanned documents, and digital assets clogging the city's archival systems — images uploaded twice, three times, sometimes more, under different file names and across incompatible databases that were never meant to talk to each other.
The problem matters now because the city is midway through a broader push to modernize how residents access public records. The Philadelphia Open Data portal, which launched its current iteration in 2017, has become a central hub for everything from zoning maps to historical photographs of Broad Street and Fairmount Park. But as staff at the City Archives at Broad and Race streets have long known, what looks clean on the front end often masks a chaotic back end built up over nearly two decades of disconnected initiatives.
How the Mess Accumulated
The roots of the duplicate-image problem stretch back to the early 2000s, when Philadelphia began its first serious push to digitize paper records. The effort was not coordinated from a single office. The Streets Department scanned its own permit photography. The Philadelphia Water Department maintained its own image library. The Office of Property Assessment commissioned separate scanning contracts for property photos across all 67 neighborhoods. The Historical Commission kept its own repository of architectural survey images dating to the 1960s.
Each wave of digitization came with its own vendor, its own file-naming convention, and its own metadata standards — or lack thereof. When the city attempted to consolidate these holdings onto shared servers, files migrated without deduplication checks. A single photograph of a rowhouse on Kensington Avenue might exist in three folders, each labeled differently, each attached to a different internal record number. Multiply that across hundreds of thousands of assets and the problem compounds fast.
A 2023 audit by the city's Office of Innovation and Technology found that roughly 34 percent of image files stored across the city's primary document management systems were either exact duplicates or near-duplicates differing only in resolution or file format. That audit, referenced in the department's fiscal year 2024 budget justification submitted to City Council, was the first time the scale of the duplication had been formally quantified.
The Cleanup Effort and What Comes Next
The Department of Records began a structured deduplication project in the first quarter of 2025, using hash-matching software to flag identical files and manual review workflows for near-duplicates. The work is being done in phases, starting with the largest single collection — property assessment photographs, which account for more than 1.2 million individual image files tied to addresses across the city, from West Philadelphia rowhouses to warehouses in Port Richmond.
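For readers curious what hash matching involves, the following is a minimal sketch in Python, not the department's actual software. It assumes the files sit under a single directory tree, and the function names and the choice of SHA-256 are illustrative assumptions. Files with byte-for-byte identical contents yield the same digest regardless of file name or folder, so they can be grouped and flagged.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large scans never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group files under root by content hash (hypothetical helper).

    Only groups holding two or more files are returned; each such
    group is a set of byte-identical files, whatever they are named.
    """
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

A check of this kind catches only exact copies. A rescan of the same photograph at a different resolution, or the same image saved in another file format, produces different bytes and therefore a different digest, which is consistent with near-duplicates being routed to manual review.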
The project is being coordinated out of the City Archives facility, which moved to its current location on Broad Street after the old facility in City Hall ran short of storage space. Staff there are working alongside contractors from a vendor selected through the city's standard procurement process, under a contract that ran through December 2025 and carries an option period extending into 2026.
For residents and researchers who use the Philadelphia Open Data portal or submit Right-to-Know requests through the city's online system, the practical effect of the cleanup should be faster search results and more accurate record matches. Lawyers working property transactions in neighborhoods like Fishtown and Brewerytown, journalists pulling historical documentation, and community organizations researching zoning histories near the Delaware waterfront have all, at various points, run into the same problem: looking up a property address and receiving multiple conflicting image sets with no clear indication of which is current.
The Department of Records has not set a public completion date for the full deduplication effort, but the fiscal year 2027 budget proposal currently before City Council includes a line item for continued archival modernization work. Anyone with questions about accessing specific records in the meantime can contact the City Archives directly or submit inquiries through the city's phila.gov records portal.