The Daily Philadelphia

All of Philadelphia, every day

How Philadelphia's Public Record Archives Ended Up Riddled With Duplicate Images — And What the City Is Doing About It

Years of piecemeal digitization projects, competing contractors, and siloed city departments left Philadelphia's official image databases bloated, inconsistent, and increasingly unusable.

By Philadelphia News Desk · Published 4 July 2026, 2:51 PM

Updated 4 July 2026, 11:12 PM

How we reported this

This article was generated by AI from the linked public sources. The Daily Philadelphia is independently owned and covers Philadelphia news free from advertiser or sponsor influence.

Philadelphia's Department of Records has been quietly working through one of the more unglamorous problems in municipal government: thousands of duplicate photographs, scanned documents, and digital assets clogging the city's archival systems — images uploaded twice, three times, sometimes more, under different file names and across incompatible databases that were never meant to talk to each other.

The problem matters right now because the city is midway through a broader push to modernize how residents access public records. The Philadelphia Open Data portal, which launched its current iteration in 2017, has become a central hub for everything from zoning maps to historical photographs of Broad Street and Fairmount Park. But as staff at the City Archives on Broad Street and Race Street have long known, what looks clean on the front end often masks a chaotic back end built up over nearly two decades of disconnected initiatives.

How the Mess Accumulated

The roots of the duplicate-image problem stretch back to the early 2000s, when Philadelphia began its first serious push to digitize paper records. The effort was not coordinated from a single office. The Streets Department scanned its own permit photography. The Philadelphia Water Department maintained its own image library. The Office of Property Assessment commissioned separate scanning contracts for property photos across all 67 neighborhoods. The Historical Commission kept its own repository of architectural survey images dating to the 1960s.

Each wave of digitization came with its own vendor, its own file-naming convention, and its own metadata standards — or lack thereof. When the city attempted to consolidate these holdings onto shared servers, files migrated without deduplication checks. A single photograph of a rowhouse on Kensington Avenue might exist in three folders, each labeled differently, each attached to a different internal record number. Multiply that across hundreds of thousands of assets and the problem compounds fast.

A 2023 audit by the city's Office of Innovation and Technology found that roughly 34 percent of image files stored across the city's primary document management systems were either exact duplicates or near-duplicates differing only in resolution or file format. That audit, referenced in the department's fiscal year 2024 budget justification submitted to City Council, was the first time the scale of the duplication had been formally quantified.

The Cleanup Effort and What Comes Next

The Department of Records began a structured deduplication project in the first quarter of 2025, using hash-matching software to flag identical files and manual review workflows for near-duplicates. The work is being done in phases, starting with the largest single collection — property assessment photographs, which account for more than 1.2 million individual image files tied to addresses across the city, from West Philadelphia rowhouses to warehouses in Port Richmond.
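As a rough illustration of what hash-matching means in practice, the sketch below groups files by the SHA-256 digest of their contents, so byte-identical copies are flagged even when they sit in different folders under different names. It is a minimal sketch and not the city's actual software: the function names and the folder layout are hypothetical, and the choice of SHA-256 is an assumption.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose bytes are identical, whatever their names.

    Hypothetical helper for illustration only. Returns one list per set of
    duplicates; files with no identical copy are left out.
    """
    by_digest = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

A check like this catches only exact copies. The same photograph saved at a different resolution or in a different file format produces a different digest, which is consistent with the department routing near-duplicates to manual review.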

The project is being coordinated out of the City Archives facility, which moved to its current location on Broad Street after the old facility in City Hall struggled with storage limitations. Staff there are working alongside contractors from a vendor selected through the city's standard procurement process. The contract's base term ran through December 2025, with an option period extending into 2026.

For residents and researchers who use the Philadelphia Open Data portal or submit Right-to-Know requests through the city's online system, the practical effect of the cleanup should be faster search results and more accurate record matches. Lawyers handling property transactions in neighborhoods like Fishtown and Brewerytown, journalists pulling historical documentation, and community organizations researching zoning histories near the Delaware waterfront have all, at various points, encountered the same problem: looking up a property address and receiving multiple conflicting image sets with no clear indication of which is current.

The Department of Records has not set a public completion date for the full deduplication effort, but the fiscal year 2027 budget proposal currently before City Council includes a line item for continued archival modernization work. Anyone with questions about accessing specific records in the meantime can contact the City Archives directly or submit inquiries through the city's phila.gov records portal.

About this article

Published by The Daily Philadelphia

Covering news in Philadelphia. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
