Last month, we sat down with Dr. Alex Sevigny to discuss data science on our #SpinSucksAMA.
Alex is a data-driven communications professional, professor, and past director of the McMaster-Syracuse Master of Communications Program.
In our industry, it’s easy to wonder how we’re ever going to really measure…well, anything.
Do we focus on organic traffic?
Social media followers?
Shares and engagement?
Where do we even begin?
Thankfully, there are people like Alex.
He makes the idea of combining communications and data science easier to grasp, and offers a framework to understand how to analyze our metrics.
The full session was really informative.
Here, we’re going to dig into some of the highlights.
What Constitutes Big Data?
Martin Waxman moderated the session and kicked it off by asking Alex to discuss what constitutes big data, compared to the types of data communicators deal with every day.
Alex explained:
“A big dataset is anything you really think you can’t handle by hand. When we talk about big data, there’s no strict definition.
You can look at a few factors like data volume. So, how much data is coming in to your organization every day, or how much data is being collected.
You can talk about velocity. That’s the speed at which it’s coming in. If you have a tweet stream going or a video stream going, for example, and people are reacting via Facebook or Twitter, you’re getting a lot of data at a quick velocity.
Then there’s variety. Many times big data, as we use the word now, implies you’re getting data that is very varied. It can be highly structured, such as customer feedback forms, responses to a survey, or a poll. Or you can have unstructured data…what you get if somebody is interacting with a chatbot you’ve set up to help you to reach out and be sort of a front line responder. Those data are going to be natural language and quite difficult to parse in a structured way.
Those are the three key points—volume, velocity, and variety.”
Unstructured Vs. Semi-Structured Vs. Structured Data
In the webinar, Alex underscored the importance of knowing the difference between structured, semi-structured, and unstructured data.
Because that can affect your analysis.
Computers work best with structured data, like a spreadsheet.
Because it’s labeled and organized, computers can analyze it faster and more efficiently than people.
And often catch patterns we’re unable to see.
Semi-structured data is simply a combination of structured and unstructured data.
Email is a good example of that.
For instance, in an email, you have the address and timestamp—structured data.
And you have the body of the email.
To a computer, the text appears unstructured as it’s just letters and punctuation.
And that makes it more difficult to parse.
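To make the email example concrete, here is a minimal sketch in Python using the standard library's email parser. The sample message, the addresses, and the `split_email` helper are made up for illustration and are not from the webinar. The idea is only to show that the headers come out as neatly labeled fields, while the body comes out as one block of free text.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# A made-up sample message (hypothetical addresses and content).
RAW_EMAIL = """\
From: sender@example.com
To: recipient@example.com
Date: Mon, 03 Jun 2019 09:30:00 -0400
Subject: Webinar follow-up

Thanks for joining the session! The audience questions were great,
and I think we should run another one soon.
"""


def split_email(raw: str) -> dict:
    """Separate the structured fields of an email from its free-text body.

    Hypothetical helper: assumes a simple, single-part, plain-text email.
    """
    msg = message_from_string(raw)

    # Structured part: every value has a label and a predictable format.
    structured = {
        "from": msg["From"],
        "to": msg["To"],
        "sent_at": parsedate_to_datetime(msg["Date"]),
        "subject": msg["Subject"],
    }

    # Unstructured part: just letters and punctuation, with no labels.
    body = msg.get_payload()

    return {"structured": structured, "unstructured": body}


parts = split_email(RAW_EMAIL)
print(parts["structured"]["from"])     # the sender's address
print(parts["structured"]["sent_at"])  # a real timestamp you can sort or filter on
print(parts["unstructured"])           # free text that would need language analysis
```

You could sort or count the structured fields right away, for example emails per day or per sender. Making sense of the body is the part that needs natural language processing.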
Unstructured data includes the words we write or say, and analyzing it requires natural language processing, where the machines learn to understand the relationship between words.
Context.
But we all use different tones and inflections when we speak.
And it’s more difficult for a computer to pick up on these.
Because what we say comes across to a computer in a waveform that isn’t divided by words.
Our brains have the capacity to make sense of the wave, pull out words, and understand the gist of a conversation.
But computers have a harder time doing this.
But they are improving.
Data Science Resources
As data science becomes an increasingly important part of our communications toolkit, we asked Alex to share some resources.
He said online courses are a great place to start getting a handle on what data science is and does.
And while he didn’t want to favor one platform over another, he did call out Codecademy.
Codecademy has some free beginner courses available, and they have a gamified interface that’s easy to follow and understand.
They also offer higher-level courses at a price.
Do we need to become expert coders and data scientists in order to succeed in communications and PR?
Probably not, but, as Alex suggests, we do need to educate ourselves on the basics.
You can watch a shorter version of the original #SpinSucksAMA here.
More Questions?
Do you love digging into data science?
If so, we’d love to hear your thoughts on this particular Spin Sucks AMA. The comments belong to you!