In this blog I will summarise the data analysis undertaken for the What Women Want 2.0 campaign.
A lot of great stuff happened on International Women’s Day (8th March). Part of the chorus of voices was the What Women Want 2.0 campaign. During the campaign, over 8,000 women answered the question “What do you want?” This was a digital repeat of an analog campaign in 1996, which collected responses on 8,000 postcards sent in by women across the country.
I had the good fortune to get involved with the digital campaign, analysing the responses to find out who responded and what they said. Our findings, along with essays on key topics, are in the final report. In this blog, I will look at what we did to get there.
The online questionnaire for the campaign asked for some basic demographics. Other methods of data collection (i.e. where people came to the campaign some other way) did not necessarily collect this information, so we only had it for a subset of responses. Where we did have this information, we found:
- The majority of respondents are white (87%)
- Respondents are of all ages, but under-40s are the most prevalent. Just under half of respondents are between 18 and 40
- Respondents are overwhelmingly from the UK
- There’s an even split between mothers and those without children
What did they say?
We started off with basic text analysis, looking at the frequency of words used. The word cloud below shows the relative use of words and phrases with stop words removed. This was done in Python using Andreas Mueller’s word cloud library.
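For readers who want to try something similar, here is a minimal sketch of the word-frequency step using only the standard library. It is not the code used for the campaign: the stop-word list and the example responses are made up for illustration, and the function name `word_frequencies` is my own.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch);
# a real analysis would use a much fuller list.
STOP_WORDS = {"i", "a", "an", "and", "the", "to", "for", "of", "in", "my", "want", "be"}


def word_frequencies(responses, stop_words=STOP_WORDS):
    """Count how often each non-stop word appears across all responses."""
    counts = Counter()
    for response in responses:
        # Lowercase the text and pull out runs of letters (and apostrophes).
        words = re.findall(r"[a-z']+", response.lower())
        counts.update(word for word in words if word not in stop_words)
    return counts


# Made-up example responses, not real campaign data.
responses = [
    "I want equal pay",
    "Equal pay and affordable childcare",
    "To be safe in the streets",
]

frequencies = word_frequencies(responses)
print(frequencies.most_common(2))
```

A dictionary of counts like this can be handed to the `wordcloud` package through `WordCloud.generate_from_frequencies` to draw the cloud itself.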
We then moved onto looking at the context in which words were used, by searching for particular collections of words in each response. Our steps:
- Another volunteer used proprietary software to run a cluster analysis, giving an indication of which words most commonly appeared with which other words
- The team used these clusters to create word groups (using word stems, e.g. educat for education, educate, educated)
- Python and I then came back into the picture. I assessed what proportion of responses contained these word groups.
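The last step above can be sketched roughly as follows. This is an illustration under assumptions rather than the code behind the report: the word groups, stems and responses are invented, and I am assuming a response counts as "containing" a group if any of its words starts with any of the group's stems.

```python
import re

# Hypothetical word groups, each a list of word stems (assumptions for this sketch).
WORD_GROUPS = {
    "education": ["educat", "school", "learn"],
    "pay": ["pay", "wage", "salar"],
}


def contains_group(response, stems):
    """Return True if any word in the response starts with any of the stems."""
    words = re.findall(r"[a-z']+", response.lower())
    return any(word.startswith(stem) for word in words for stem in stems)


def group_proportions(responses, word_groups=WORD_GROUPS):
    """Return, for each word group, the share of responses that contain it."""
    total = len(responses)
    if total == 0:
        return {name: 0.0 for name in word_groups}
    return {
        name: sum(contains_group(response, stems) for response in responses) / total
        for name, stems in word_groups.items()
    }


# Made-up example responses, not real campaign data.
responses = [
    "Better education for girls",
    "Equal pay and a good school for my kids",
    "To feel safe",
    "Educated women everywhere",
]

proportions = group_proportions(responses)
print(proportions)
```

Matching on word prefixes is simple, but it misses irregular forms: the stem "pay" would not catch "paid", so a group may need extra stems to cover cases like that.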
The results are in the report; screenshots are below:
So that’s it. If you have questions or comments, get in touch.