
Refactoring a legacy codebase with a god Repository and incomplete Clean Architecture

Software Engineering Stack Exchange

I'm currently working on a large legacy project that tried to implement Clean Architecture combined with MVVM, but unfortunately didn't fully adhere to the principles.

One major problem: the repository layer simply forwards raw data from the data source (the network API) directly to the ViewModel. The transformation and formatting of the data is done in the ViewModel itself, often in private helper methods. This makes the ViewModel hard to test and violates separation of concerns.

We’re planning a refactoring, and we want to move the transformation logic elsewhere. However, we’re facing a challenge:

There is one huge, heavily used repository class (the "God Repository") that is referenced in a large number of places throughout the codebase.

Although we have multiple repositories, this one in particular has grown into a superclass-like dependency that is used across many features, mainly because it points to our main network API and delivers the data for most features.

Data transformation logic varies across different consumers (ViewModels), so there’s no single, consistent transformation.

Ultimately, we want to cleanly separate transformation logic and reduce the responsibility and surface area of this repository.

We're currently considering these possible strategies:

  • Introduce Use Cases that act as intermediaries between ViewModel and Repository. These Use Cases would handle all business logic and transformation, so eventually, the God Repository could be phased out or minimized.

  • Introduce a Domain Layer with entities and services responsible for transforming the raw data. This could involve patterns like Mappers, Adapters, or Domain Services.

  • Refactor the God Repository into smaller, more modular repositories over time, each scoped to a specific feature or domain area.
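To make the first and third strategies concrete, here is a minimal sketch of how they might combine. Every name in it (`UserRepository`, `OrderRepository`, `GetUserDisplayNameUseCase`, the DTO fields) is hypothetical, the network calls are replaced by hard-coded stand-ins, and the functions are synchronous only to keep the example self-contained (in the real app they would presumably be `suspend` functions):

```kotlin
// All names below are hypothetical; the real API is not shown here.
data class UserDto(val firstName: String, val lastName: String)
data class OrderDto(val id: Long)

// Narrow, feature-scoped interfaces carved out of the large class.
interface UserRepository {
    fun getUser(id: String): UserDto
}

interface OrderRepository {
    fun getOrders(userId: String): List<OrderDto>
}

// The existing class keeps its behavior and only declares the new interfaces,
// so current call sites continue to compile unchanged.
class GodRepository : UserRepository, OrderRepository {
    // Stand-ins for the real network calls.
    override fun getUser(id: String): UserDto = UserDto("Ada", "Lovelace")
    override fun getOrders(userId: String): List<OrderDto> = listOf(OrderDto(1L))
}

// A use case that sees only the slice it needs and owns one transformation.
class GetUserDisplayNameUseCase(private val users: UserRepository) {
    operator fun invoke(id: String): String {
        val dto = users.getUser(id)
        return "${dto.firstName} ${dto.lastName}".uppercase()
    }
}
```

Because the use case depends on `UserRepository` rather than on `GodRepository`, it can be tested with a small fake, and the implementation behind the interface could later be swapped for a dedicated repository without touching the consumers.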

Our main goal is to refactor piece by piece, so that we don't run into issues from changing too much of the codebase at once.

So now I’m wondering:

  1. How would you approach refactoring a God Repository that’s tightly coupled across many parts of a legacy codebase?
  2. Are there any design patterns or techniques (e.g., Facade, Strategy, Visitor, etc.) that you’ve found useful for cleanly handling this kind of transformation?

Edit:

For clarification, here is a small example:

Let’s say we have a large app where user data is fetched from a remote server. We have a Repository that is used in most ViewModels. It connects directly to the network layer and simply returns raw data (a UserDto object from an API). Now imagine one ViewModel wants to show the user's full name in uppercase. It fetches the raw data from the repository and then formats the name manually inside the ViewModel. Another ViewModel does something different, like displaying just the first name with a greeting. Again, it formats that directly inside the ViewModel.
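In code, the situation described above and one possible way out might look roughly like the sketch below. The `UserDto` fields, the greeting text, and the class names are assumptions made for illustration, and the ViewModels are plain classes here (the real ones would extend `androidx.lifecycle.ViewModel`):

```kotlin
// Hypothetical DTO; the real UserDto fields are not shown.
data class UserDto(val firstName: String, val lastName: String)

// Stand-in for the existing repository, which returns raw API data.
interface Repository {
    fun getUser(): UserDto
}

// Before: the formatting lives in a private helper inside the ViewModel,
// so it can only be tested through the ViewModel.
class ProfileViewModelBefore(private val repository: Repository) {
    fun title(): String = formatName(repository.getUser())

    private fun formatName(dto: UserDto): String =
        "${dto.firstName} ${dto.lastName}".uppercase()
}

// After: each consumer-specific transformation is a small pure function
// that can be unit-tested without a ViewModel or a repository.
fun fullNameUppercase(dto: UserDto): String =
    "${dto.firstName} ${dto.lastName}".uppercase()

fun greeting(dto: UserDto): String = "Hello, ${dto.firstName}!"

// The ViewModels only hold state and delegate the formatting.
class ProfileViewModel(
    private val repository: Repository,
    private val format: (UserDto) -> String = ::fullNameUppercase
) {
    fun title(): String = format(repository.getUser())
}

class HomeViewModel(
    private val repository: Repository,
    private val format: (UserDto) -> String = ::greeting
) {
    fun title(): String = format(repository.getUser())
}
```

The repository is untouched in this step; only the helper moves out of the ViewModel, which keeps each change small and lets the transformations diverge per consumer.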

Some context about the project:

It's an Android application, written in Kotlin with some legacy Java code. It targets both mobile and TV platforms.

In the Android development world, it's quite common to use a Clean Architecture + MVVM pattern, since Android provides many components that align well with this architecture, especially to deal with the complex lifecycle of mobile apps.

In MVVM, the ViewModels are primarily responsible for holding UI state, so they belong to the UI layer. This means they shouldn't be formatting or transforming raw data — that's a responsibility of other layers like the domain or data layer.


Ideally, the data would be transformed in the repository (or in a Use Case) and only then passed to the ViewModel. However, in our current setup, that would require extensive changes across the entire project, because the "God Repository" is referenced in many ViewModels throughout the app.

We’re currently looking for a way to refactor this step by step — breaking down the God Repository and introducing proper data transformation logic in a cleaner and more modular way.
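One shape such a step-by-step migration could take is a thin adapter in front of the existing class: it exposes a domain model through a new interface, while untouched ViewModels keep calling the old class directly. This is only a sketch under assumptions; `LegacyGodRepository`, `fetchUser`, `User`, `toDomain`, and the nullable DTO fields are all hypothetical names and shapes:

```kotlin
// Hypothetical raw model; nullable fields are an assumption about the API.
data class UserDto(val firstName: String?, val lastName: String?)

// Domain model that the rest of the app can rely on.
data class User(val firstName: String, val lastName: String) {
    val fullName: String
        get() = listOf(firstName, lastName)
            .filter { it.isNotEmpty() }
            .joinToString(" ")
}

// Mapper: the single place where raw data is cleaned up.
fun UserDto.toDomain(): User = User(
    firstName = firstName.orEmpty().trim(),
    lastName = lastName.orEmpty().trim()
)

// Stand-in for the existing class; it is not modified by this step.
class LegacyGodRepository {
    fun fetchUser(id: String): UserDto = UserDto("Ada", "Lovelace")
}

// New, narrow contract for migrated features.
interface UserRepository {
    fun getUser(id: String): User
}

// Adapter: delegates to the legacy class and maps the result.
class LegacyBackedUserRepository(
    private val legacy: LegacyGodRepository
) : UserRepository {
    override fun getUser(id: String): User = legacy.fetchUser(id).toDomain()
}
```

ViewModels could then be moved to `UserRepository` one at a time. Once no caller uses the legacy method directly, the adapter could be replaced by an implementation that talks to the network layer itself.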

Hope this clarifies the issue!