The outbreak of the novel coronavirus disease, COVID-19, which began in mid-December 2019, has spread far beyond its origin in the city of Wuhan in central China. It has taken approximately 2,000 lives, most of them in China, with fatalities mounting in other parts of the world as well. Having put the world on alert and brought global trade and the economy to a standstill, the virus has established itself as an international crisis. Its global impact has led to criticism of the Chinese government and of the way it handled the outbreak, particularly in the early stages. International media and health organisations have accused the Chinese government of downplaying the seriousness of the virus and, more importantly, its communicability. This allowed a lax management effort in the early stages of what turned from a regional into a global crisis. It has also prompted a debate about the Chinese government's withholding of information that was of international concern. To contribute to this discourse, we intend to look into the flow of information online under China's notoriously strict internet laws. We would like to analyse whether important public information, and the way it is published, differs between China-based media sites and the global media landscape.
The motivation for this research stems from the significance not only of the virus itself but also of how information of both domestic and international concern is interfered with. China is the second-largest economy in the world, and its role at the global level is immense. It also has some of the strictest media and internet laws, and even a brief look at them shows the grip that the government has over the media and the internet. We would like to understand how these factors play out against the wider international landscape of media and social organisations, the vast majority of which operate under very different conditions.
The methods through which we intend to carry out this research are a mix of data scraping, categorising and tagging, complemented by a content analysis. We will use the search engines Baidu and Google as the platforms from which we source our data. Using the tool Data Miner, we will scrape the search results for the key term “coronavirus” on both platforms. We will then categorise the data by source and by the reputation of the publishing organisation. We will also carry out a content analysis of the articles to understand their nature, which should let us draw clearer distinctions between the results. Finally, we will conduct a comparative analysis of the data from Baidu and Google in the hope of drawing conclusions that contribute to the wider discourse on the importance of media democracy.
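The categorising and tagging step could be sketched roughly as follows. This is only an illustration under stated assumptions: the row layout (an engine name and a URL), the lookup table `SOURCE_CATEGORIES` and the sample rows are hypothetical, and in practice the categories would come from manual coding rather than a fixed table.

```python
from urllib.parse import urlparse

# Hypothetical lookup table mapping known domains to a source category.
# The real categories would be assigned by hand during coding.
SOURCE_CATEGORIES = {
    "who.int": "health organisation",
    "nhs.uk": "health organisation",
    "nytimes.com": "news media",
    "theguardian.com": "news media",
}


def domain_of(url):
    """Return the host name of a URL without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def categorise(url):
    """Map a URL to a source category, or 'uncategorised' if unknown."""
    domain = domain_of(url)
    for known, category in SOURCE_CATEGORIES.items():
        # Match the domain itself or any of its subdomains.
        if domain == known or domain.endswith("." + known):
            return category
    return "uncategorised"


# Made-up sample rows standing in for the scraped search results.
rows = [
    {"engine": "google", "url": "https://www.who.int/"},
    {"engine": "google", "url": "https://www.nytimes.com/"},
    {"engine": "baidu", "url": "https://example.cn/article"},
]

# Attach a category tag to every row without mutating the originals.
tagged = [dict(row, category=categorise(row["url"])) for row in rows]
```

Rows that end up as "uncategorised" are the ones that would still need manual review before the comparative analysis.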
The rationale behind our approach was to look at the primary sources through which information is spread. Baidu is the largest search engine in China and the second largest globally. It was the logical choice of platform for our research, especially for the data that needed to be collected in the Chinese context. Google is Baidu's global counterpart and is where most of the world looks for information. In contrast to Google's model of information that is accessible to everyone, Baidu must comply with the strict internet laws of China. We are using the tool Data Miner because, unlike many tools, it is not platform specific and can be used for both Google and Baidu. It makes the task of extracting data easier and is also familiar to us. We are applying a variety of methods to the data. This is to make sure that the data is synthesised into something comprehensible to a mass audience, and that we can draw sound conclusions that contribute not only to this particular debate about coronavirus but also to the wider debate on freedom of information.
We have begun our research with some preliminary data extraction using Data Miner. We set the parameters for our test, restricting the time period to 15 December to 15 January. We then scraped the search results for the term “coronavirus” on our two chosen platforms, Baidu and Google. Some tendencies are already noticeable in the data extracted from these test results. What is evident, at least from this surface-level research, is that Baidu is more inclined towards sources and organisations of Chinese origin, both public and private, that fall under the authority of the Chinese government. These results contrast sharply with the search results on Google. Google's search results include more internationally renowned sources such as the World Health Organisation, the National Health Service in the UK and the National Institute for Public Health and the Environment in the Netherlands, as well as news organisations like The New York Times and The Guardian.
1. The search results for Google using Data Miner
2. The results for Baidu using Data Miner
3. The Google results extracted by Data Miner
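The time window used for the preliminary extraction could also be enforced after scraping, as in the minimal sketch below. It assumes that each exported row carries a publication date in ISO format, which search result pages do not always expose. The `published` field, the sample rows and the years in the window (2019 and 2020, inferred from the timing of the outbreak) are assumptions made for illustration.

```python
from datetime import date

# Assumed window: the text gives 15 December to 15 January;
# the years are inferred from the timing of the outbreak.
WINDOW_START = date(2019, 12, 15)
WINDOW_END = date(2020, 1, 15)


def in_window(published, start=WINDOW_START, end=WINDOW_END):
    """Return True if an ISO date string falls inside the window (inclusive)."""
    return start <= date.fromisoformat(published) <= end


# Made-up rows; 'published' is a hypothetical column in the export.
rows = [
    {"title": "A", "published": "2019-12-20"},
    {"title": "B", "published": "2020-02-01"},
    {"title": "C", "published": "2020-01-15"},
]

# Keep only the rows published within the test period.
kept = [row for row in rows if in_window(row["published"])]
```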
The preliminary search results give us good grounds for our research. This first look at the data shows what we can establish from it, what conclusions it might support, and which steps we need to take to carry the research through. We will now attempt to collect more detailed data and adjust the parameters so that the data is as relevant as possible to our hypothesis. We will take the search results, observe patterns in them, and visualise which websites and organisations are referenced most often. We will then analyse the nature of the content to substantiate the variation in the information available on each search engine.
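Finding the most referenced websites on each search engine, and gauging how much the two result sets have in common, could be approached along these lines. This is a sketch rather than our final procedure: the URL lists are made-up samples, and the Jaccard overlap is one possible comparison measure that we introduce here only as an illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_counts(urls):
    """Count how often each domain appears in a list of result URLs."""
    counts = Counter()
    for url in urls:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        counts[host] += 1
    return counts


def overlap(a, b):
    """Jaccard overlap between the sets of domains in two counters."""
    union = set(a) | set(b)
    return len(set(a) & set(b)) / len(union) if union else 0.0


# Made-up sample URLs standing in for the scraped results.
google = domain_counts([
    "https://www.who.int/a",
    "https://www.who.int/b",
    "https://www.nhs.uk/c",
])
baidu = domain_counts([
    "https://example.cn/a",
    "https://www.who.int/d",
])

print(google.most_common(1))   # most referenced domain on Google
print(overlap(google, baidu))  # share of domains common to both engines
```

The per-domain counts can be passed directly to a charting tool for the visualisation, and an overlap close to zero would indicate that the two engines draw on almost entirely different sources.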