The Daily Philadelphia

All of Philadelphia, every day

News

City Archivists and Tech Experts Weigh In on Philadelphia's Duplicate Image Problem — and What It's Costing Taxpayers

Officials and digital preservation specialists are calling for urgent action as duplicated photographs clog the city's public records systems and drive up storage costs.


By Philadelphia News Desk · Published 4 July 2026, 3:45 PM

4 min read

Updated 4 July 2026, 11:47 PM

How we reported this

This article was generated by AI from the linked public sources. The Daily Philadelphia is independently owned and covers Philadelphia news free from advertiser or sponsor influence. Read our editorial standards →


Philadelphia's municipal digital archive has a clutter problem. City archivists, IT administrators, and open-government advocates are raising alarms about tens of thousands of duplicate images sitting inside the city's document management systems — redundant files that consume server space, slow retrieval times, and complicate public records requests filed under Pennsylvania's Right-to-Know Law.

The issue has gained traction in recent weeks as the Philadelphia Office of Innovation and Technology has begun an internal audit of its enterprise content management platform, which stores everything from permit photographs taken by the Department of Licenses and Inspections to crime scene imagery routed through the Police Department's digital evidence repositories. Duplicate images accumulate when files are uploaded multiple times without automated deduplication checks — a mundane but expensive technical failure that several major cities addressed years ago.
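For readers wondering what an automated deduplication check at upload might look like, the following is a minimal Python sketch. It illustrates the general technique of content hashing and does not describe the city's platform: the `UploadIndex` class and its `register` method are hypothetical names, and the in-memory dictionary stands in for whatever database a real system would use.

```python
import hashlib
from typing import Dict, Optional


def content_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


class UploadIndex:
    """Hypothetical in-memory index of files already stored, keyed by digest."""

    def __init__(self) -> None:
        self._seen: Dict[str, str] = {}

    def register(self, name: str, data: bytes) -> Optional[str]:
        """Record a new file and return None, or return the name of the
        existing byte-for-byte identical file without storing a second copy."""
        digest = content_digest(data)
        existing = self._seen.get(digest)
        if existing is not None:
            return existing
        self._seen[digest] = name
        return None


index = UploadIndex()
first = index.register("permit_photo.jpg", b"example image bytes")
second = index.register("permit_photo_copy.jpg", b"example image bytes")
print(first, second)
```

A check like this catches only exact copies. A file that has been resized, recompressed, or re-exported produces a different digest and would pass through unflagged.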

Why This Is Surfacing Now

The timing matters. Philadelphia is in the middle of a multi-year push to digitize records held at the City Archives on Broad Street, a project tied to a broader modernization effort that the Managing Director's Office has been coordinating since at least 2024. Digitization contracts are priced partly by storage volume, meaning duplicate files directly inflate what the city pays vendors. Storage costs for cloud-hosted municipal data have climbed sharply across American cities over the past three years, with per-terabyte rates from major providers rising as demand outpaces infrastructure investment.

Temple University's Department of Library and Information Science, based in North Philadelphia, has produced research on institutional image deduplication workflows that city staff have reportedly consulted. Faculty there have described the problem as common among large municipalities that digitized rapidly during the COVID-19 pandemic without building back-end quality controls. The Philadelphia-based nonprofit OpenDataPhilly, which aggregates and publishes city data sets, has also flagged inconsistencies in image-linked data that could partly trace back to duplicated source files.

Staff at the Free Library of Philadelphia, whose digital collections team operates out of the Parkway Central branch on Vine Street, have dealt with analogous challenges in managing photographic collections. Librarians there have described deduplication as labor-intensive without the right software tooling — a sentiment echoed by records managers in other city departments.

What Experts and Officials Want Done

Digital preservation consultants who work with Pennsylvania government agencies point to three practical interventions: deploying perceptual hashing software to flag near-duplicate images before ingestion, establishing clear file-naming conventions enforced at the point of upload, and conducting a one-time retrospective clean-up of existing repositories. All three steps are well within the technical capacity of a city of Philadelphia's size, specialists say, but they require budget allocation and inter-departmental coordination that have historically been difficult to achieve.
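To show what perceptual hashing means in practice, here is a minimal Python sketch of one simple variant, the average hash. It is an illustration under stated assumptions, not the software any city agency uses: images are assumed to be already decoded into grayscale pixel grids whose dimensions divide evenly by the hash size, and the function names and the threshold of 5 differing bits are hypothetical choices made for this example.

```python
from typing import List

Grid = List[List[int]]  # grayscale pixels (0-255), one inner list per row


def shrink(pixels: Grid, size: int = 8) -> Grid:
    """Downsample to size x size by averaging equal blocks.
    Assumes height and width are both divisible by size."""
    block_h = len(pixels) // size
    block_w = len(pixels[0]) // size
    small = []
    for r in range(size):
        row = []
        for c in range(size):
            block = [pixels[y][x]
                     for y in range(r * block_h, (r + 1) * block_h)
                     for x in range(c * block_w, (c + 1) * block_w)]
            row.append(sum(block) // len(block))
        small.append(row)
    return small


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Pack one bit per downsampled cell: 1 if brighter than the mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, threshold: int = 5) -> bool:
    """Flag two images whose hashes differ in at most `threshold` bits.
    The threshold is an assumed value that would need tuning on real data."""
    return hamming(average_hash(a), average_hash(b)) <= threshold


# Synthetic test images: a left-to-right gradient, a slightly brighter
# copy of it, and an unrelated top-to-bottom gradient.
horizontal = [[x * 16 for x in range(16)] for _ in range(16)]
brighter = [[min(255, v + 10) for v in row] for row in horizontal]
vertical = [[y * 16 for _ in range(16)] for y in range(16)]

print(is_near_duplicate(horizontal, brighter))
print(is_near_duplicate(horizontal, vertical))
```

Unlike a byte-level checksum, this kind of hash is designed to stay stable under small changes such as brightness shifts, which is why consultants recommend it for catching near-duplicates rather than only exact copies.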

The Office of Innovation and Technology has not yet published findings from its current audit. A spokesperson for the office confirmed the review is ongoing but declined to provide a timeline for completion or a preliminary estimate of how many duplicate files have been identified. The city's IT budget for fiscal year 2027, which began July 1, allocates funding for infrastructure modernization broadly, but line-item detail on archive deduplication efforts has not been made public.

Council members whose districts include major city facilities have not yet scheduled hearings on the matter. That includes the members covering Center City and Kensington, where L&I inspection activity generates large volumes of photographic evidence. Government watchdog groups, including the Committee of Seventy, have noted that the quality of records management directly affects how efficiently the public can exercise its rights under the Right-to-Know Law.

For residents and journalists who rely on public records, the practical advice is straightforward: if a records request produces an unusually large batch of photographic files, it is worth asking L&I or the relevant department whether deduplication was applied before the release. For City Hall, the clock is ticking: every month the audit drags on is another month of unnecessary storage bills landing on Philadelphia taxpayers.


About this article

Published by The Daily Philadelphia

Covering news in Philadelphia. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
