An eye-catching picture at the head of a blog post. It’s the done thing. But where do these pictures come from? Well, like pretty much everything else on the Internet, some are originals, created by the author or an accomplice, whilst others are simply copies of pre-existing online content. Sometimes the copies are re-posted with the creator’s permission, and sometimes they’re just stolen. This image is an original, taken by the author of the post, but would you have known or cared if it wasn’t?
If not, why should you care what’s original and what’s copied? Well, it can be pretty embarrassing to make a huge fuss of an individual’s contribution to the Internet, only to find later that they stole it all from someone else. If you run round praising a Web user as a productive genius, but it turns out they’re actually just a thief, you’re likely to feel stupid, and stupid mistakes never really leave you – especially when you’ve made them in public. So knowing which content is original and which is copied is in everyone’s interests.
The most prolific havens of rehashed content are social media sites, which are primarily driven not by creatives, but by a full cross-section of the public, who want attention, but don’t necessarily have any inherent means of getting it. They’re probably limited for time, they may not be that creatively inclined, and in an environment where the consequences for appropriating other people’s content are virtually non-existent, many see no problem with using secondhand material as if it were their own.
Twitter encourages unattributed re-posting from outside sources, because the format is so chronically limited in space. There’s barely room in a Tweet for a sentence of text and a couple of tags, let alone an attribution to the content source. This, combined with the site’s ‘treadmill’ nature (you have to Tweet constantly to stay on the radar), has helped to create a culture of – to put it politely – endemic and endless ‘content recycling’.
Even within the walls of Twitter itself, millions of users re-post other people’s Tweets as their own. And this is not down to an accident, an oversight or a lack of attribution space. Twitter has a built-in, one-click means of passing on other people’s Tweets with full attribution. It’s called the Retweet button. But many users deliberately bypass that most simple of options. They see a Tweet they find entertaining, or funny or whatever, and they obviously know they can click the Retweet button. Instead, though, they think…
“Hmmm, if I Retweet that, someone else is going to get the credit. But if I copy and paste it into a Tweet of my own, then I will get the credit.”
The self-serving human instinct takes over and you see something like this…
It’s clear from the formatting and punctuation of the Tweets that this is no case of “great minds think alike”. It’s evidently an endless list of people copying and pasting a pre-existing piece of content, as is. The same happens with pictures.
The particular work the above Twitter users are repeating (let’s call it the “Salad tastes good” joke) makes for an interesting case. Setting aside the fact that many would argue salad actually doesn’t taste good, so the joke isn’t particularly strong, the audit trail of attribution has long since vanished. We therefore face a considerable amount of detective work if we want to find out who is actually responsible for that gag.
MAKING A START
Whilst Twitter is bad for encouraging unattributed mass redistribution, one thing it is good at is preserving the precise dates and times when Tweets were posted. Nothing can be edited or backdated, so if you post something on Twitter before someone else, you can always prove you were first.
This rigid timelining within Twitter has been seized upon by digital guru Amit Agarwal, who came up with a free app which allows you to find the first occurrence, on Twitter, of a given phrase. If you’ve seen a text contribution on the world’s biggest micro-blogging site, the Who Said It First? app is a great place to start the detective challenge of establishing who said what, first. You can find it via the link below…
It can take some time to process your search, but the app does a thorough job, and undeniably makes a valiant attempt to find the very first iteration of a phrase on Twitter.
The problem, however, is that…
a) quotes and one-line jokes often come from outside of Twitter, so the first person to Tweet the phrase may not be the creator, and…
b) some users change bits of the wording either to fit Twitter’s character limit, or to try and improve the phrase, or because they want to try and transform the content sufficiently to impart their own intellectual property upon it. This can (but doesn’t always) fool the app.
In this case, due to changes in the wording, the Who Said It First? app didn’t find the earliest instance of the “Salad tastes good” gag on Twitter. I was able to find several earlier examples with different wording using Topsy.
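For anyone who wants to try this kind of search on their own collection of posts, here is a minimal sketch of the idea in Python. It is not how the Who Said It First? app or Topsy works internally (I don't know their methods). It simply assumes you already have a list of posts with timestamps, and it compares an exact-phrase search with a looser word-overlap search. The `Post` class, the function names and the 0.7 threshold are all my own illustrative assumptions.

```python
import re
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Post:
    """A hypothetical record of one post; not any real platform's format."""
    author: str
    posted_at: datetime
    text: str


def tokens(text):
    """Lower-case the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def normalise(text):
    """Join the tokens back up so punctuation and case no longer matter."""
    return " ".join(tokens(text))


def coverage(phrase, text):
    """Fraction of the phrase's distinct words that also appear in the text."""
    wanted = set(tokens(phrase))
    if not wanted:
        return 0.0
    return len(wanted & set(tokens(text))) / len(wanted)


def earliest_exact(posts, phrase):
    """Earliest post containing the phrase word for word, or None."""
    hits = [p for p in posts if normalise(phrase) in normalise(p.text)]
    return min(hits, key=lambda p: p.posted_at, default=None)


def earliest_match(posts, phrase, threshold=0.7):
    """Earliest post sharing at least `threshold` of the phrase's words.

    The threshold is an arbitrary assumption: set it too low and unrelated
    posts creep in, too high and reworded copies are missed.
    """
    hits = [p for p in posts if coverage(phrase, p.text) >= threshold]
    return min(hits, key=lambda p: p.posted_at, default=None)
```

With a loose match like this, a version that ends "tastes nice" would still be caught by a search for "tastes good", whereas the exact search would skip straight past it. Anything the loose search turns up still needs reading by eye, because sharing most of the words is not the same thing as being the same joke.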
Topsy is a very powerful Twitter search resource because it stores extant Tweets going right back to the very first one (March 2006), and its facilities allow reverse ordering, as well as the breaking up of long phrases into smaller components. The earliest example of the “Salad tastes good” joke I could find through Topsy was from 18th August 2011. It’s split across two Tweets, but here’s the punchline…
she said "Tell me something I don't know" i replied, with a tear in my eye, Salad tastes nice
Ali Bla Bla (@aliblabla), August 18, 2011
NOW GOOGLE IT
Twitter, however, is not the universe, so the next, obvious port of call is Google. Google allows you to set a date range for your search (as explained in How To Find Everything On The Internet), which means you can cut out all dates later than the earliest instance you’ve already found.
Theoretically, with a date range set as required, you should only see older instances of a phrase than your current oldest. But unfortunately, Google can get confused by people’s deliberate backdating of posts, or posts which don’t show a publication date, and it can end up displaying a line of ‘false positives’. For the “Salad tastes good” joke, various sites appeared in Google’s results with much earlier dates than August 2011, but some of those dates pre-dated the launch of the actual sites, so clearly they were false positives.
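The sanity check I applied by hand can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the results are taken to be simple dictionaries with a `site` and a `date`, the site names and launch dates are invented placeholders that you would have to look up yourself, and none of it talks to Google.

```python
from datetime import date


def plausible_earlier_results(results, cutoff, site_launched):
    """Keep only results that could genuinely pre-date the current earliest find.

    results       -- list of dicts with "site" and "date" keys (hypothetical shape)
    cutoff        -- date of the earliest instance found so far
    site_launched -- dict mapping a site name to its launch date, where known
    """
    kept = []
    for result in results:
        claimed = result["date"]
        if claimed is None:
            # No publication date shown: set aside for checking by hand.
            continue
        if claimed > cutoff:
            # Later than what we already have, so of no interest.
            continue
        launched = site_launched.get(result["site"])
        if launched is not None and claimed < launched:
            # The post claims to be older than the site itself: a false positive.
            continue
        kept.append(result)
    # Oldest first, so the strongest candidate is at the top.
    return sorted(kept, key=lambda r: r["date"])
```

Results dated on the cutoff day itself are deliberately kept, because an earlier time on the same day still counts. Sites with no known launch date are kept too, which means they still need checking by eye.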
I did, however, find that the joke was publicised on the UGC (user generated content) site Sickipedia on 18th August 2011 – the same day as the first example I could find on Twitter, but at an earlier time.
I was unable to find anything genuinely posted earlier than the Sickipedia entry, so it looks to me as though that was the first widely publicised source on the Internet.
Why do I type widely publicised in italics? Well, because the nature of the Internet means that obscure references, from less influential users, may not have been picked up by the search engines. It’s also the case that someone could previously have coined this phrase on a private social media account (which can’t be indexed by Google), or someone could have posted it on a public blog or social media site, then closed their account, taking an earlier iteration offline in the process.
Then, of course, you have to consider that the original example may not even have come from the Internet. Jokes can come out of old joke books, from television programmes, from live stand-up routines in bars, etc.
The Sickipedia user who first added the joke does have a substantial duplicates total on their stats page, so it wouldn’t be a great surprise to learn that the joke did pre-exist somewhere.
WHAT’S THE POINT OF THIS EXERCISE IF THE RESULT IS INCONCLUSIVE?
The results won’t always be inconclusive, but even when they are, the point is the journey itself. In taking a voyage of discovery into where social media content originates, you find A HELL OF A LOT of people tacitly claiming ownership of ideas which really aren’t their own. It opens your eyes to how staggeringly derivative human behaviour can be – not to mention sly.
Doing this kind of retrospective search for the first time, it can be quite shocking to see just how many of the influential users you thought were original actually show up as copycats – often just lazily lifting the creations and ideas of others word for word. Google cites about 123,000 instances of the salad joke in its index, and there are bound to be more it hasn’t picked up.
It’s also interesting how the format of the content tends to vary early on in the redistribution progression, with people using different wording. Then, once the phrase gets a bigger audience and goes viral via an influential user, the wording and format become rigid. Everyone just copies and pastes it from that point forward, until the next influential user changes the wording – then that becomes the new format.
There’s an interesting turnaround of the gag very early on in its life on Twitter, where a user makes himself into the butt of the joke…
This woman in the shop comes up to me and says oooh you are a big lad I said tell me something I don't know She said salad tastes nice
daniel aj knott (@ajknott), August 19, 2011
That, for me, was sufficiently transformative to be valid as an addition to the pre-existing joke. It made the user look a lot less nasty too.
It’s difficult, with a site like Twitter, to determine the rights of people ‘recycling’ content. There’s a precedent in the offline world in which people tell jokes, and most of their peers will understand they didn’t create those jokes themselves – even though they don’t literally say: “I didn’t write this joke, and here’s who did”. It’s the same when friends show each other pictures in copyright-controlled books. That, to an extent, can be seen to exonerate the unattributed re-posting of short-form content.
But the Internet is a publication medium, and that means people’s work is being replicated not just on a one-to-one level (as with a couple of mates in a pub), but potentially to a very large audience. That changes the dynamics. For Twitter users with big followings, tweeting is clearly publishing, and not personal chat. That makes the use of secondhand content far less acceptable in my opinion. Does anyone have the right to mass redistribute someone else’s intellectual property without Fair Use transformation, permission or attribution?
When you can and do create your own content, your bias is bound to be against ‘recycling’. But if you can’t create your own material, things are going to look different. I’ve found my photos re-posted without permission or attribution on Twitter and elsewhere, and when the re-poster’s intention is just to share a point of interest with a few friends, I’m not going to be negative about it. But I do have a very different attitude to those who calculatingly set out to use my work and the work of others to achieve success, profile-build, and make money. Those people are con artists, and if you support them, it’s not me who’s being conned and taken for an idiot – it’s you.