As a budding field, social media research is a magnet for questions from experts and novices alike. People are curious about the processes and methodologies used to accomplish the various aspects of the research. Some of the more common questions we field on a regular basis are as follows:
What sentiment analysis system do you use?
How do you carry out the text analysis process?
What is your method for identifying and eliminating spam?
In fact, all of these questions are one and the same. They have nothing to do with sentiment, text analysis, or spam. They have nothing to do with processes or methods or systems. They have everything to do with validity.
Validity refers to truth. Is the sentiment scored accurately? Is the text analyzed accurately? Is the spam identified accurately? Is the entire process valid? Among all the pieces of the puzzle, this is the one question that must be answered.
Unfortunately, no single method automatically identifies one sentiment analysis, text analysis, or spam detection system as the most valid. You simply have to evaluate a large, representative sample of data and determine the answer for yourself. Are your results valid?
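One way to make "evaluate a sample and determine the answer for yourself" concrete is to draw a random sample of records, have a person label each one by hand, and then measure how often the system agrees with that person. The Python sketch below is only an illustration under those assumptions: the function names (draw_sample, agreement_report), the label values, and the example data are all hypothetical, and it is not a description of any particular vendor's process. It also treats the human labels as the reference, which is itself an assumption worth checking with more than one coder.

```python
import random
from collections import Counter


def draw_sample(records, sample_size, seed=0):
    """Draw a simple random sample of records without replacement.

    A fixed seed keeps the sample reproducible (hypothetical helper).
    """
    rng = random.Random(seed)
    return rng.sample(records, min(sample_size, len(records)))


def agreement_report(pairs):
    """Compare system labels against human labels.

    pairs: list of (system_label, human_label) tuples.
    Returns overall accuracy plus precision and recall for each label,
    treating the human label as the reference.
    """
    if not pairs:
        raise ValueError("need at least one labeled pair")

    system_counts = Counter(s for s, _ in pairs)
    human_counts = Counter(h for _, h in pairs)
    match_counts = Counter(s for s, h in pairs if s == h)

    per_label = {}
    for label in sorted(set(system_counts) | set(human_counts)):
        hits = match_counts[label]
        per_label[label] = {
            # Of the items the system gave this label, how many were right?
            "precision": hits / system_counts[label] if system_counts[label] else None,
            # Of the items a human gave this label, how many did the system find?
            "recall": hits / human_counts[label] if human_counts[label] else None,
        }

    accuracy = sum(match_counts.values()) / len(pairs)
    return {"accuracy": accuracy, "per_label": per_label}


# Made-up example: (system label, human label) for four sampled posts.
example_pairs = [
    ("positive", "positive"),
    ("negative", "positive"),
    ("negative", "negative"),
    ("positive", "positive"),
]
report = agreement_report(example_pairs)
print(report["accuracy"])
print(report["per_label"])
```

The same comparison works for spam detection (labels such as "spam" and "not spam") or for text categories. Overall accuracy alone can mislead when one label dominates the sample, which is why the sketch also reports per-label precision and recall.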