When starting a PhD program, or any research project for that matter, one of the first things you have to do is a literature review. When I first started, I found searching the literature and organizing the readings excruciating. Where do you begin? In what order should you read the articles? Where do you stop reading? After delving into bibliometrics, I found that its tools can make the literature review far less painstaking and more efficient. In this post, I list my ideas on how various bibliometric techniques can aid in this task.
Downloading your Papers
One of the first steps is downloading the literature. Many researchers do this by searching Google Scholar with the keywords they are familiar with. The problem with this process, especially for beginning researchers, is that they do not know all the relevant keywords in the first place and thus exclude many important papers. More advanced researchers can turn to Web of Science or Scopus and apply Boolean operators to narrow or widen their search. But the problem persists: how can you ensure that you have not excluded valuable articles that do not use your keywords in their title, abstract, or author-identified keywords?
Bibliometrics offers an approach that can help. To make your collection of articles comprehensive, you can grow it from a seed of articles. First, download a set of articles using keywords you are sure are related to your topic of interest. Then grow this set by downloading the articles it frequently cites. You can set a minimum threshold for the number of times an article must be cited within the seed set before it is added. This can easily be done with software like CitNetExplorer, which exports the DOIs.
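The thresholding step can be sketched in a few lines of Python. Everything here is illustrative: the dictionary structure, the DOI strings, and the default threshold are my own assumptions, and a real workflow would first parse the reference lists exported by CitNetExplorer or Web of Science.

```python
from collections import Counter

def frequently_cited(seed_references, min_citations=3):
    """Count how often each reference appears across the seed set and
    return those cited at least `min_citations` times."""
    counts = Counter(ref for refs in seed_references.values() for ref in refs)
    return {ref for ref, n in counts.items() if n >= min_citations}

# Hypothetical seed set: each paper maps to the DOIs in its reference list.
seed = {
    "paper_a": ["doi:10.1/x", "doi:10.1/y", "doi:10.1/z"],
    "paper_b": ["doi:10.1/x", "doi:10.1/y"],
    "paper_c": ["doi:10.1/x"],
}
print(sorted(frequently_cited(seed, min_citations=2)))
# ['doi:10.1/x', 'doi:10.1/y']
```

Only the references cited by at least two seed papers survive, which is exactly the filtering the software applies for you.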
Extending this further, you can also download the citing articles. This is especially helpful in fields where advances occur constantly, making it difficult to track the keywords in use, and it lets you identify the adjacent fields the original field is extending into. This step can easily be done through the citation report feature of Web of Science. As a caveat, you should set a threshold on how many papers from the original dataset a citing paper must reference before it is added, to ensure that the papers remain relevant. This threshold can be absolute or relative: for instance, require that a paper cites at least 5 papers from the original dataset, or that at least 30% of its references come from it. You should also consider the journal and subject category the article belongs to.
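The absolute-or-relative rule can be expressed as a small filter. This is a sketch under my own assumptions: the function name, data shapes, and the 5-paper / 30% defaults are illustrative, not part of any tool's API.

```python
def is_relevant(citing_refs, original_set, min_overlap=5, min_share=0.30):
    """A citing paper passes if it cites at least `min_overlap` papers from
    the original dataset (absolute rule), or if at least `min_share` of its
    references fall inside that dataset (relative rule)."""
    overlap = len(set(citing_refs) & original_set)
    if overlap >= min_overlap:
        return True
    return bool(citing_refs) and overlap / len(set(citing_refs)) >= min_share

original = {"p1", "p2", "p3", "p4", "p5"}
# Cites only 2 originals, but they are 2 of its 5 references (40% >= 30%).
print(is_relevant(["p1", "p2", "q1", "q2", "q3"], original))  # True
```

A short, tightly focused citing paper can pass through the relative rule even when it misses the absolute cutoff, which is why having both is useful.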
Organizing your Papers
Having downloaded the papers, it is important to organize them by topic. To help identify the subtopics within your main topic, you can create a rough co-occurrence map of the keywords using software like VOSviewer. This shows the different keywords used in your literature and how related they are to each other.
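Under the hood, a keyword co-occurrence map is built from simple pair counts, which can be sketched as follows (the keyword lists here are hypothetical; VOSviewer computes this, plus the layout and clustering, from your exported records):

```python
from collections import Counter
from itertools import combinations

def keyword_cooccurrence(keyword_lists):
    """Count how often each pair of keywords appears on the same paper;
    the counts become edge weights in a co-occurrence map."""
    pairs = Counter()
    for keywords in keyword_lists:
        for pair in combinations(sorted(set(keywords)), 2):
            pairs[pair] += 1
    return pairs

# Hypothetical author keywords from three papers.
papers = [
    ["bibliometrics", "citation analysis"],
    ["bibliometrics", "literature review"],
    ["bibliometrics", "citation analysis", "literature review"],
]
print(keyword_cooccurrence(papers)[("bibliometrics", "citation analysis")])  # 2
```

Keywords that co-occur often end up close together on the map, and the resulting clusters are good candidates for the subtopics of your review.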
A more direct way of organizing the papers is to plot the bibliographic coupling network of the publications. This plot groups papers according to how closely they are related, based on the references they share.
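The coupling weights themselves are just shared-reference counts between pairs of papers. A minimal sketch, with hypothetical paper and reference identifiers:

```python
from itertools import combinations

def bibliographic_coupling(references):
    """Weight each pair of papers by the number of references they share."""
    return {
        (a, b): len(set(references[a]) & set(references[b]))
        for a, b in combinations(references, 2)
    }

refs = {
    "paper_1": ["r1", "r2", "r3"],
    "paper_2": ["r2", "r3", "r4"],
    "paper_3": ["r5"],
}
# Drop zero-weight pairs before handing the edges to a plotting tool.
edges = {pair: w for pair, w in bibliographic_coupling(refs).items() if w > 0}
print(edges)  # {('paper_1', 'paper_2'): 2}
```

Unlike keyword co-occurrence, this works even when papers use inconsistent terminology, since relatedness is inferred from what they cite rather than what they say.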
Reading your Papers
Now that you have organized your papers, there are many ways to read them, according to your preference. I propose subdividing them into core papers and current papers. Read the core papers first to understand the foundations of the field; these are identified by a high citation count within your set of papers. The current papers, on the other hand, show the current trends in the field; these are identified by looking at the latest publications in the top journals of your field. These journals can be identified by combining measures of citations, number of relevant articles, and relatedness of keywords.
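The core/current split can be sketched as a simple partition. The record fields, cutoff values, and example papers below are all illustrative assumptions; in practice the in-set citation counts come from your downloaded citation data.

```python
def split_core_and_current(papers, core_min_citations=20, current_since=2022):
    """Split a reading list into core papers (highly cited within the set)
    and current papers (recent publications); both cutoffs are illustrative."""
    core = [p["id"] for p in papers if p["in_set_citations"] >= core_min_citations]
    current = [p["id"] for p in papers if p["year"] >= current_since]
    return core, current

# Hypothetical records for two papers in the set.
papers = [
    {"id": "classic", "year": 1998, "in_set_citations": 45},
    {"id": "recent", "year": 2023, "in_set_citations": 2},
]
print(split_core_and_current(papers))  # (['classic'], ['recent'])
```

Counting citations *within* the set, rather than globally, keeps a famous but off-topic paper from crowding out the foundational work of your specific field.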
Everyone has their own system for carrying out the actual literature review. Fortunately, I have found something that works for me: combining Microsoft Access with a qualitative data analysis software like ATLAS.ti. I plan to share my system in the coming weeks.
NOTE: This is draft #1 and is still under revision.