
# What we're working on

Photoroom's goal is to deliver the AI image editing solution that offers the strongest possible level of product fidelity.

To achieve this goal, we have a dedicated in-house Machine Learning team that actively researches which types of AI models can edit images with minimal risk of hallucinations.

Right now this team is exploring a new idea: training a model that operates directly in *pixel space*.

### What does *pixel space* mean?

Current state-of-the-art AI image editing models don't operate directly on the pixels of the image.

Instead they operate on a learned representation of the image that lives in what is called a *latent space*.

It's a fairly technical topic, but here's the gist of it:

First, a separate model compresses your image into a much smaller grid of numbers (called *tokens*). Think of it like a zip file — smaller, but with all the important information preserved.

The editing model then uses that compressed version as a reference when performing the editing.

Finally, another model decompresses it back into a full image.

But here's the important tradeoff: that compressed format is an **interpretation** of your image, not a perfect copy.

When the model edits it and decompresses it back, it's essentially reconstructing the image from an approximation, which means it can sometimes fill in details that weren't there originally.

This is where hallucinations come from: the model confuses what was in the image with what it *expects* to be there.
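The round trip described above can be sketched with a toy example. This is only an illustration under loud assumptions: average pooling stands in for the learned compression model, and nearest-neighbour upsampling stands in for the learned decompression model. The function names (`encode`, `decode`, `max_abs_error`) are hypothetical and are not part of any Photoroom model or API. A real decoder would tend to fill in plausible detail rather than blur, but the underlying point is the same: fine detail that did not survive compression cannot be recovered exactly.

```python
# Toy sketch only. Average pooling and nearest-neighbour upsampling are
# stand-ins for the learned encoder and decoder of a real latent-space model.

def encode(image, factor):
    """Compress: average each factor x factor block into a single number."""
    height, width = len(image), len(image[0])
    return [
        [
            sum(image[y + dy][x + dx]
                for dy in range(factor)
                for dx in range(factor)) / factor ** 2
            for x in range(0, width, factor)
        ]
        for y in range(0, height, factor)
    ]


def decode(latent, factor):
    """Decompress: repeat each latent value over a factor x factor block."""
    return [
        [latent[y // factor][x // factor]
         for x in range(len(latent[0]) * factor)]
        for y in range(len(latent) * factor)
    ]


def max_abs_error(a, b):
    """Largest per-pixel difference between two images of the same size."""
    return max(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


# A 4x4 grayscale image with one fine detail: a single bright pixel.
image = [[0.0] * 4 for _ in range(4)]
image[1][2] = 1.0

# Latent-space path: compress and decompress, without even applying an edit.
latent = encode(image, factor=2)
reconstructed = decode(latent, factor=2)
latent_error = max_abs_error(image, reconstructed)

# Pixel-space path: change one pixel directly and leave the rest untouched.
pixel_edited = [row[:] for row in image]
pixel_edited[3][0] = 0.5
untouched_error = max(
    abs(image[y][x] - pixel_edited[y][x])
    for y in range(4)
    for x in range(4)
    if (y, x) != (3, 0)
)

print("latent:", latent)
print("max error after latent round trip:", latent_error)
print("max error on untouched pixels (pixel space):", untouched_error)
```

In this toy setup the bright pixel is smeared across its block by the round trip, while the direct pixel edit leaves every other pixel exactly as it was.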

<figure><img src="/files/ZTLk3Z4U0ouIVSUiAsoI" alt=""><figcaption></figcaption></figure>

To avoid this issue, the Machine Learning team at Photoroom is currently training a model that operates directly in *pixel space*, without using any intermediary representation of the image.

If you're interested in learning more about this process, here's an article that explains how the training works:

{% embed url="<https://huggingface.co/blog/Photoroom/prx-part3>" %}

You can also try an early version of the model in [this Hugging Face Space](https://huggingface.co/spaces/Photoroom/PRX-Pixel) and even download its weights from [the model repository](https://huggingface.co/Photoroom/prxpixel-t2i).

