What Are You Doing Outside the Kitchen?

In November 2011, I ran four sentences through Google Translate. English to Hebrew.

The sentences:

  • I wash the car
  • I wash the floor
  • I wash the kitchen
  • I go shopping

Hebrew is a gendered language. In the present tense, every verb has a masculine and a feminine form, even when the subject is "I". The translator had to pick one.

It picked masculine for the car. Feminine for the floor. Feminine for the kitchen. Masculine for shopping.

The subject is "I" in all four sentences. In English, "I" has no gender. The only thing that changed was the object.

I posted it to Facebook with the title מה את עושה מחוץ למטבח? A Hebrew idiom: "What are you doing outside the kitchen?" It's the kind of thing a certain kind of man says to a woman who has opinions.

Friends were quick to name it. Statistical sexism. And not just the kitchen. The floor gets the feminine treatment too. Both are inside the house.

Ten years later, I ran the same sentences again.

אוטו had become מכונית. A more formal word for car. Everything else was identical.

They updated the vocabulary. Not the bias.

The interface had changed completely. Neural machine translation had replaced the old statistical models. Transformers had transformed everything. The bias had not moved.

This is what people mean when they say models reflect their training data. It doesn't mean the model is broken. It means the model learned from text written by humans, and humans have assumptions baked into the words they use, the sentences they construct, the contexts where certain verbs appear.

The car is washed by a man because most of the sentences about car washing that the model learned from were written by or about men. Same logic, opposite direction, for kitchens and floors.
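
Here is a minimal sketch of that mechanism in Python. The corpus and its counts are invented for illustration; they are not measurements of any real training set. A real translation system is far more complex than a majority vote. The names (TOY_CORPUS, count_forms, pick_gender) are hypothetical. The point is only that no line mentions gender roles, and the output reproduces them anyway.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration only: each entry pairs the object of
# "wash" with the gender of the verb form it appeared with.
# These counts are made up; they do not come from any real training data.
TOY_CORPUS = (
    [("car", "masculine")] * 8 + [("car", "feminine")] * 2
    + [("floor", "masculine")] * 3 + [("floor", "feminine")] * 7
    + [("kitchen", "masculine")] * 2 + [("kitchen", "feminine")] * 8
)


def count_forms(corpus):
    """Count how often each verb gender appears with each object."""
    counts = defaultdict(Counter)
    for obj, gender in corpus:
        counts[obj][gender] += 1
    return counts


def pick_gender(counts, obj):
    """Pick the verb gender seen most often with this object."""
    return counts[obj].most_common(1)[0][0]


counts = count_forms(TOY_CORPUS)
choices = {obj: pick_gender(counts, obj) for obj in ("car", "floor", "kitchen")}
print(choices)
```

Change the counts and the choices change with them. The only thing the picker knows is what was written down.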

Nobody programmed the model to be sexist. It observed the world as it was written down and reported back faithfully.

That's the uncomfortable part. The model isn't wrong. It's accurate. A mirror held up to the language we actually use, not the world we say we want. That's what society wrote down.