Datafying Text Exposes Censorship of the Past
We are actually seeing words become datafied. And by doing that, we can learn things that we never could before. As an example, take text on a page. If we were to scan it and then send it to each other digitally, we would have simply digitized the page, and the result is just an image file of it. The words aren’t really data; they’re simply sent in a numeric form, if you will, in a digital form. That’s digitization. When we datafy it, we take the text so that we can take each individual letter and each individual word and understand it in context with everything else.
We’re just going to treat it as data. So what researchers have done is look at the names of certain artists over time to see how their popularity grows and changes. In doing that, they were able to find some very interesting patterns. It turns out that there were some artists who were very popular at the beginning of the 20th century and then just sort of disappeared. And then they became popular again towards the middle of the 20th century. And what the researchers found was censorship.
What they were able to identify by datafying text is that references to one painter, Marc Chagall, in the German language went dark between 1933 and 1945. He was Jewish. The world leaves a trace. And the trace is data. In the past we could never collect it and never understand it, and now we can.
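To make the idea concrete, here is a minimal sketch of what counting a name over time might look like once text is treated as data. It is not the researchers' actual method or dataset: the corpus is a handful of invented sentences, "Artist X" is a placeholder name, and the functions `mentions_per_year` and `silent_years` are hypothetical helpers written for this illustration.

```python
import re
from collections import Counter

# Hypothetical toy corpus: (publication year, text) pairs invented for
# illustration only. These are NOT real historical data.
corpus = [
    (1930, "The gallery showed new work by Artist X. Critics praised Artist X."),
    (1931, "Artist X exhibited again this spring."),
    (1935, "The exhibition featured landscape painters."),
    (1940, "A survey of painting this year."),
    (1950, "A retrospective of Artist X opened. Artist X attended."),
]


def mentions_per_year(corpus, name):
    """Return, for each year, mentions of `name` divided by total words that year."""
    name_pattern = re.compile(re.escape(name), re.IGNORECASE)
    hits = Counter()    # mentions of the name per year
    totals = Counter()  # total word tokens per year
    for year, text in corpus:
        hits[year] += len(name_pattern.findall(text))
        totals[year] += len(re.findall(r"\w+", text))
    # Normalizing by total words keeps years with more text from dominating.
    return {year: hits[year] / totals[year] for year in sorted(totals)}


def silent_years(freqs):
    """Return the years in which the name does not appear at all."""
    return [year for year, freq in freqs.items() if freq == 0]


freqs = mentions_per_year(corpus, "Artist X")
for year, freq in freqs.items():
    print(year, round(freq, 3))
print("years with no mentions:", silent_years(freqs))
```

A scanned image of these pages could not answer this question, because the name would just be pixels. Once the words are data, a gap in the mentions becomes something you can look for, though a gap on its own would still need historical context to be read as censorship.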
In Their Own Words is recorded in Big Think’s studio.