Using Python for Bibliometric Analysis: Demo on Science Entrepreneurship

I needed to familiarize myself with the literature on science entrepreneurship (for reasons I’m going to explain soon). After delving into bibliometrics and repeatedly doing literature reviews for my PhD, I now have a system for efficiently getting to know a new literature. In this post, I explain my process, hoping it helps others who are also entering a new field.

I typically follow these steps:

  1. Explore the Web of Knowledge using a keyword search
  2. Explore data in Python
  3. Create visualizations using VosViewer

The first step for me is usually just trying out different keywords in the Web of Knowledge. I then browse the first page of the latest articles and the top cited articles. I try to check whether these are related to my topic of interest.

For this topic of science entrepreneurship, I settled on the following keywords. I also narrowed the search to management journals that I know are relevant to technology and innovation management, plus general management journals. Moreover, I was only interested in papers published from 2010 onward. Below is my keyword search:

TS=(science OR technology ) AND TS=(startup* OR “start up” OR “new venture” OR entrepreneur* OR “new firm” OR “spin off” OR spinoff* OR SME OR SMEs) AND SO=(RESEARCH POLICY OR R D MANAGEMENT OR STRATEGIC MANAGEMENT JOURNAL OR JOURNAL OF PRODUCT INNOVATION MANAGEMENT OR ACADEMY OF MANAGEMENT REVIEW OR ACADEMY OF MANAGEMENT JOURNAL OR TECHNOVATION OR SCIENTOMETRICS OR TECHNOLOGICAL FORECASTING “AND” SOCIAL CHANGE OR TECHNOLOGY ANALYSIS STRATEGIC MANAGEMENT OR ORGANIZATION SCIENCE OR ADMINISTRATIVE SCIENCE QUARTERLY OR JOURNAL OF BUSINESS VENTURING OR INDUSTRY “AND” INNOVATION OR STRATEGIC ENTREPRENEURSHIP JOURNAL OR JOURNAL OF TECHNOLOGY TRANSFER OR JOURNAL OF ENGINEERING “AND” TECHNOLOGY MANAGEMENT OR JOURNAL OF MANAGEMENT OR JOURNAL OF MANAGEMENT STUDIES OR RESEARCH TECHNOLOGY MANAGEMENT OR ENTREPRENEURSHIP THEORY “AND” PRACTICE OR ACADEMY OF MANAGEMENT ANNALS OR ACADEMY OF MANAGEMENT PERSPECTIVES OR JOURNAL OF BUSINESS RESEARCH OR BRITISH JOURNAL OF MANAGEMENT OR EUROPEAN JOURNAL OF MANAGEMENT OR MANAGEMENT SCIENCE)

After exploring the results, I downloaded the articles, 1412 in total. Since WOS only allows downloading 500 records at a time, I named the files 1-500.txt, 501-1000.txt and so on, and saved them all in a folder (named Raw in this case) on my computer.

Data Exploration in Python

In the following, I show the code to import the data into Python and format the articles into a pandas dataframe.

import re, csv, os 
import pandas as pd
import numpy as np
import nltk
import math
%matplotlib inline
import seaborn as sns
import matplotlib.pyplot as plt
sns.set_style('white')
from collections import Counter

columnnames =['PT','AU','DE', 'AF','TI','SO','LA','DT','ID','AB','C1','RP','EM','CR','NR','TC','U1','PU','PI','PA','SN','EI','J9','JI','PD','PY','VL','IS','BP','EP','DI','PG','WC','SC','GA','UT']

def convertWOScsv(filename):
    # read the whole export file; the context manager closes it afterwards
    with open(filename, encoding='latin-1') as openfile:
        sampledata = openfile.read()
    # divide into list of records 
    individualrecords = sampledata.split('\n\n')
    databaseofWOS = []
    for recordindividual in individualrecords:
        onefile = {}
        for x in columnnames:
            everyrow = re.compile('\n'+x + ' ' + '((.*?))\n[A-Z][A-Z1]', re.DOTALL)
            rowsdivision = everyrow.search(recordindividual)
            if rowsdivision:
                onefile[x] = rowsdivision.group(1)
        databaseofWOS.append(onefile)
    return databaseofWOS

def massconvertWOS(folder):
    publicationslist = []
    for file in os.listdir(folder):
        if file.endswith('.txt'):
            # os.path.join builds the path on any operating system, not only Windows
            converttotable = convertWOScsv(os.path.join(folder, file))
            publicationslist += converttotable
    publicationslist = pd.DataFrame(publicationslist)
    publicationslist.dropna(how='all', inplace=True)
    publicationslist.reset_index(drop=True, inplace=True)
    publicationslist['PY'] =publicationslist['PY'].fillna('').replace('', '2019').astype(int)
    publicationslist['TC'] = publicationslist['TC'].apply(lambda x: int(x.split('\n')[0]))
    return publicationslist

df = massconvertWOS('Raw')
df = df.drop_duplicates('UT').reset_index(drop=True)
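
To see what the regular expression in convertWOScsv is doing, here is a minimal sketch on a single made-up record. The record text and the helper name parse_record are illustrative assumptions, not real WOS data; the pattern itself is the one used above. Each field starts with its tag, and the lazy match runs until the next line that begins with a new tag, so indented continuation lines (such as additional authors) stay inside the field.

```python
import re

# A made-up record in the WOS plain-text layout: a two-character tag, a space,
# then the value; continuation lines are indented. Illustrative only.
sample_record = (
    "\nPT J"
    "\nAU Doe, J"
    "\n   Roe, R"
    "\nTI A hypothetical title"
    "\nPY 2015"
    "\nER"
)

def parse_record(record, tags):
    """Extract each tagged field with the same pattern as convertWOScsv."""
    fields = {}
    for tag in tags:
        pattern = re.compile('\n' + tag + ' ' + '((.*?))\n[A-Z][A-Z1]', re.DOTALL)
        match = pattern.search(record)
        if match:
            fields[tag] = match.group(1)
    return fields

parsed = parse_record(sample_record, ['PT', 'AU', 'TI', 'PY'])
print(parsed)
```

The AU value keeps its embedded line break, which is why the author lists in the previews below show up with \n between names.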

I preview some of the articles that I was able to download below. I chose the relevant columns to show.

print('Number of Articles:', df.shape[0])
df.head()[['TI', 'AU', 'SO', 'PY']]
Number of Articles: 1412
  | TI | AU | SO | PY
0 | Non-linear effects of technological competence… | Deligianni, I\n Voudouris, I\n Spanos, Y\n… | TECHNOVATION | 2019
1 | Creating new products from old ones: Consumer … | Robson, K\n Wilson, M\n Pitt, L | TECHNOVATION | 2019
2 | What company characteristics are associated wi… | Koski, H\n Pajarinen, M\n Rouvinen, P | INDUSTRY AND INNOVATION | 2019
3 | Through the Looking-Glass: The Impact of Regio… | Vedula, S\n York, JG\n Corbett, AC | JOURNAL OF MANAGEMENT STUDIES | 2019
4 | The role of incubators in overcoming technolog… | Yusubova, A\n Andries, P\n Clarysse, B | R & D MANAGEMENT | 2019

WOS is smart in the sense that it may return papers it judges relevant even when their text does not contain the keywords you entered. To drop the papers that did not contain the keywords I wanted, I further filtered the dataset by checking the title, abstract and author-selected keywords. I also removed articles that have no cited references recorded (an empty CR field).

df["txt"] = df["TI"].fillna("") + " " + df["DE"].fillna("") + " " + df["AB"].fillna("")
df["txt"] = df["txt"].apply(lambda x: x.replace('-', ' '))
df = df[df['txt'].apply(lambda x: any([y in x.lower() for y in ['scien', 'technolog']]))]
df = df[df['txt'].apply(lambda x: any([y in x.lower() for y in ['startup', 'start up', 'new venture', 'entrepreneur', 'new firm', 'spin off',
                                                                'spinoff', 'sme ', 'smes ']]))]
df = df[~df['CR'].isnull()] 
print('Number of Articles:', df.shape[0])
Number of Articles: 846
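
One detail of the filter above: 'sme ' and 'smes ' rely on a trailing space to avoid matching inside longer words, which also means that "SMEs." at the end of a sentence is missed. A possible alternative, sketched below under my own assumptions (the names venture_pattern and mentions_venture are hypothetical), uses regular-expression word boundaries instead. Swapping it in would likely change the article count reported above.

```python
import re

# Hypothetical alternative to the substring test: \b word boundaries let "SME"
# match at the end of a sentence but not inside a longer word.
venture_pattern = re.compile(
    r"startup|start up|new venture|entrepreneur|new firm|spin off|spinoff|\bsmes?\b",
    re.IGNORECASE,
)

def mentions_venture(text):
    """Return True if the text contains any of the venture-related keywords."""
    return bool(venture_pattern.search(text.replace('-', ' ')))

print(mentions_venture("Innovation in SMEs."))       # keyword at sentence end
print(mentions_venture("A reassessment of timing"))  # no keyword present
```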

I can plot the number of articles over time:

df.groupby('PY').size().plot(kind='bar')

I can look at the breakdown per journal:

#df.groupby('SO').size().sort_values().plot(kind='barh', figsize=[5,10])
soplot = df.pivot_table(index='PY', columns='SO', aggfunc='size').fillna(0) #.reset_index()
soplot = soplot[soplot.sum(axis=0).sort_values().index].reset_index().rename(columns={'PY':'Year'})
soplot['Year'] = pd.cut(soplot['Year'], [0, 2014, 2019], labels=['2010-2014', '2015-2019'])
soplot.groupby('Year').sum().T.plot(kind='barh', stacked=True, figsize=[5,10])
plt.ylabel('Journal')
plt.xlabel('Number of Articles')
plt.show()

I can look at the top cited articles. This shows the foundational material that I should know before delving into the topic.

topcited = df['CR'].fillna('').apply(lambda x: [y.strip() for y in x.split('\n')]).sum()
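# Series.value_counts replaces the top-level pd.value_counts, which newer pandas versions deprecate
pd.Series(topcited).value_counts().head(10)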
COHEN WM, 1990, ADMIN SCI QUART, V35, P128, DOI 10.2307/2393553                  115
Shane S, 2004, NEW HORIZ ENTREP, P1                                               88
Shane S, 2000, ACAD MANAGE REV, V25, P217, DOI 10.5465/amr.2000.2791611           87
Rothaermel FT, 2007, IND CORP CHANGE, V16, P691, DOI 10.1093/icc/dtm023           86
BARNEY J, 1991, J MANAGE, V17, P99, DOI 10.1177/014920639101700108                81
Shane S, 2000, ORGAN SCI, V11, P448, DOI 10.1287/orsc.11.4.448.14602              78
TEECE DJ, 1986, RES POLICY, V15, P285, DOI 10.1016/0048-7333(86)90027-2           77
Di Gregorio D, 2003, RES POLICY, V32, P209, DOI 10.1016/S0048-7333(02)00097-5     77
EISENHARDT KM, 1989, ACAD MANAGE REV, V14, P532, DOI 10.2307/258557               75
Nelson R.R., 1982, EVOLUTIONARY THEORY                                            69
dtype: int64

The articles above are not very specific to our topic of interest; they are foundational papers in innovation and management. To explore papers that are more relevant to our topic, I can find which papers are cited most by the other papers within this dataset, all of which include the keywords that I’m interested in. I call this the internal citation count of a paper.

def createinttc(df):
    df["CRparsed"] = df["CR"].fillna('').str.lower().astype(str)
    df["DI"] = df["DI"].fillna('').str.lower()
    df["intTC"] = df["DI"].apply(lambda x: sum([x in y for y in df["CRparsed"]]) if x!="" else 0)
    df["CRparsed"] = df["CR"].astype(str).apply(lambda x: [y.strip().lower() for y in x.split('\n')])
    return df

df = createinttc(df).reset_index(drop=True)
df.sort_values('intTC', ascending=False)[['TI', 'AU', 'SO', 'PY', 'intTC']].head(10)
    | TI | AU | SO | PY | intTC
401 | 30 years after Bayh-Dole: Reassessing academic… | Grimaldi, R\n Kenney, M\n Siegel, DS\n W… | RESEARCH POLICY | 2011 | 45
301 | Academic engagement and commercialisation: A r… | Perkmann, M\n Tartari, V\n McKelvey, M\n … | RESEARCH POLICY | 2013 | 41
428 | Why do academics engage with industry? The ent… | D’Este, P\n Perkmann, M | JOURNAL OF TECHNOLOGY TRANSFER | 2011 | 32
402 | The impact of entrepreneurial capacity, experi… | Clarysse, B\n Tartari, V\n Salter, A | RESEARCH POLICY | 2011 | 26
407 | ENDOGENOUS GROWTH THROUGH KNOWLEDGE SPILLOVERS… | Delmar, F\n Wennberg, K\n Hellerstedt, K | STRATEGIC ENTREPRENEURSHIP JOURNAL | 2011 | 24
430 | Entrepreneurial effectiveness of European univ… | Van Looy, B\n Landoni, P\n Callaert, J\n … | RESEARCH POLICY | 2011 | 23
398 | The Bayh-Dole Act and scientist entrepreneurship | Aldridge, TT\n Audretsch, D | RESEARCH POLICY | 2011 | 20
400 | The effectiveness of university knowledge spil… | Wennberg, K\n Wiklund, J\n Wright, M | RESEARCH POLICY | 2011 | 19
515 | Convergence or path dependency in policies to … | Mustar, P\n Wright, M | JOURNAL OF TECHNOLOGY TRANSFER | 2010 | 19
413 | Entrepreneurial Origin, Technological Knowledg… | Clarysse, B\n Wright, M\n Van de Velde, E | JOURNAL OF MANAGEMENT STUDIES | 2011 | 19
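
To make the internal citation count concrete, here is a minimal sketch on toy data. The three papers, their DOIs and their reference strings are made up, and internal_citations is a hypothetical helper that mirrors the idea of createinttc rather than reproducing it. Note that the match is a plain substring test, so a DOI that happens to be a prefix of another DOI would be counted too; that caveat applies to this sketch and is an assumption worth checking on real data.

```python
import pandas as pd

# Toy data: three hypothetical papers; DOIs and references are made up.
toy = pd.DataFrame({
    "DI": ["10.1000/a", "10.1000/b", ""],
    "CR": [
        "Smith J, 2001, J TEST, V1, P1, DOI 10.1000/b",
        "Jones K, 1999, J TEST, V2, P5",
        "Doe J, 2010, J TEST, DOI 10.1000/a\nRoe R, 2011, J TEST, DOI 10.1000/b",
    ],
})

def internal_citations(frame):
    """For each paper, count how many reference lists in the frame contain its DOI."""
    refs = frame["CR"].fillna("").str.lower()
    dois = frame["DI"].fillna("").str.lower()
    return dois.apply(
        lambda doi: int(refs.str.contains(doi, regex=False).sum()) if doi else 0
    )

toy["intTC"] = internal_citations(toy)
print(toy[["DI", "intTC"]])
```

The first paper is cited once inside the toy set, the second twice, and the paper without a DOI gets zero because it cannot be matched.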

A complementary approach is to look at the articles that cite the most other papers in the dataset. This allows us to see which reviews already integrate the studies within our dataset. We can then start reading from this set of papers, as they already cover a lot of the other papers in the dataset.

doilist = [y for y in df['DI'].dropna().tolist() if y!='']
df['Citing'] = df['CR'].apply(lambda x: len([y for y in doilist if y in x]))
df.sort_values('Citing', ascending=False)[['TI', 'AU', 'SO' , 'PY',  'Citing', ]].head(10)
    | TI | AU | SO | PY | Citing
139 | Conceptualizing academic entrepreneurship ecos… | Hayter, CS\n Nelson, AJ\n Zayed, S\n O’C… | JOURNAL OF TECHNOLOGY TRANSFER | 2018 | 75
168 | THE PSYCHOLOGICAL FOUNDATIONS OF UNIVERSITY SC… | Hmieleski, KM\n Powell, EE | ACADEMY OF MANAGEMENT PERSPECTIVES | 2018 | 33
138 | Re-thinking university spin-off: a critical li… | Miranda, FJ\n Chamorro, A\n Rubio, S | JOURNAL OF TECHNOLOGY TRANSFER | 2018 | 31
122 | Public policy for academic entrepreneurship in… | Sandstrom, C\n Wennberg, K\n Wallin, MW\n … | JOURNAL OF TECHNOLOGY TRANSFER | 2018 | 28
37 | Opening the black box of academic entrepreneur… | Skute, I | SCIENTOMETRICS | 2019 | 28
166 | RETHINKING THE COMMERCIALIZATION OF PUBLIC SCI… | Fini, R\n Rasmussen, E\n Siegel, D\n Wik… | ACADEMY OF MANAGEMENT PERSPECTIVES | 2018 | 25
68 | The technology transfer ecosystem in academia…. | Good, M\n Knockaert, M\n Soppe, B\n Wrig… | TECHNOVATION | 2019 | 24
40 | Theories from the Lab: How Research on Science… | Fini, R\n Rasmussen, E\n Wiklund, J\n Wr… | JOURNAL OF MANAGEMENT STUDIES | 2019 | 22
659 | How can universities facilitate academic spin-… | Rasmussen, E\n Wright, M | JOURNAL OF TECHNOLOGY TRANSFER | 2015 | 21
73 | Stimulating academic patenting in a university… | Backs, S\n Gunther, M\n Stummer, C | JOURNAL OF TECHNOLOGY TRANSFER | 2019 | 21
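
One thing to watch in the Citing computation: createinttc lowercases the DI column, while CR keeps its original casing, so a DOI that contains capital letters in a reference list (for example 10.1016/S0048-7333(02)00097-5 in the top-cited list earlier) would not match. Below is a hedged sketch of a case-insensitive variant on toy data. The function name count_internal_refs and the DOIs are made up, and applying it to the real dataset may give counts that differ from the table above.

```python
def count_internal_refs(reference_text, dois):
    """Count how many dataset DOIs appear in one paper's reference list."""
    lowered = reference_text.lower()
    return sum(1 for doi in dois if doi and doi in lowered)

# Hypothetical DOIs (already lowercased, as createinttc does) and a made-up reference list.
dataset_dois = ["10.1000/s001", "10.1000/xyz", ""]
references = (
    "Doe J, 2010, J TEST, DOI 10.1000/S001\n"
    "Roe R, 2011, J TEST, DOI 10.1000/abc"
)

case_insensitive = count_internal_refs(references, dataset_dois)
case_sensitive = len([d for d in dataset_dois if d and d in references])
print(case_insensitive, case_sensitive)
```

In this toy case the case-insensitive count finds the one matching DOI, while the case-sensitive comparison finds none.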

Bibliometric Analysis in VosViewer

To create visualizations of the papers, we take the following steps. First, we export the filtered dataset into a text file.

def convertWOStext(dataframe, outputtext):
    dataframe["PY"]=dataframe["PY"].astype(int)
    txtresult = ""
    for y in range(0, len(dataframe)):
        for x in columnnames:
            # skip missing fields; a comparison with np.nan is always True, so use pd.notna
            if pd.notna(dataframe[x].iloc[y]):
                txtresult += x + " " + str(dataframe[x].iloc[y]) + "\n"
        # ER marks the end of each record
        txtresult += "ER\n\n"
    # the context manager closes the file after writing
    with open(outputtext, "w", encoding='utf-8') as f:
        f.write(txtresult)

convertWOStext(df, 'df.txt')
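
A short aside on testing for missing values when writing out fields like this: NaN never compares equal to anything, including itself, so an inequality test against np.nan is always True and cannot be used to skip empty fields. The tiny sketch below, with a made-up variable name, shows the difference between that comparison and pd.notna.

```python
import numpy as np
import pandas as pd

value = np.nan                       # stand-in for a missing field
compares_unequal = value != np.nan   # True even though the value is missing
is_present = pd.notna(value)         # explicit missing-value check

print(compares_unequal, is_present)
```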

We can then open the file in VosViewer. From there, we can create various visualizations. I like using bibliographic coupling to map all the papers in the dataset.

I saved the map in VosViewer. This gives two files: one has the data on each document and the second has the network data. We modify these files to make certain changes. First, the citation counts in the document file include citations from papers outside the dataset. I want the internal citations to be shown instead, so I replace them.

def createvosfile1(filename, df, updatecit= False, newclusters = False, newname=None):
    vosfile1  = pd.read_csv(filename, sep="\t")
    voscolumns = vosfile1.columns
    vosfile1["title"] = vosfile1["description"].apply(lambda x: x.split("Title:</td><td>")[1])
    vosfile1["title"] = vosfile1["title"].apply(lambda x: x.split("</td></tr>")[0])
    df["TI2"] = df["TI"].apply(lambda x: " ".join(x.lower().split()))
    vosfile1 = vosfile1.merge(df[[x for x in df.columns if x not in voscolumns]], how="left", left_on="title", right_on="TI2")
    vosfile1["txt"] = vosfile1["TI"].fillna(" ") + " " + vosfile1["DE"].fillna(" ") + " " + vosfile1["AB"].fillna(" ")  
    vosfile1["txt"] = vosfile1["txt"].apply(lambda x: x.lower())
    vosfile1["weight<Citations>"] = vosfile1["intTC"].fillna(0)
    vosfile1 = vosfile1.drop_duplicates('id')
    vosfile1['id'] = vosfile1.reset_index().index + 1
    if newclusters:
        # artclusters is expected to exist already (one cluster label per row); it is not created in this post
        vosfile1['cluster'] = artclusters
    if updatecit:
        vosfile1[voscolumns].to_csv(newname, sep="\t", index=False)
    return vosfile1

df = createvosfile1('Processed/VosViewer_1_Original.txt', df, newname='Processed/VosViewer_1_intCit.txt', updatecit=True, newclusters=False)

The above network uses only the citation data of the publications. To improve it, I like integrating the textual data from the title, abstract and keywords. I followed the steps suggested here for cleaning the text (https://www.analyticsvidhya.com/blog/2016/08/beginners-guide-to-topic-modeling-in-python/). I then combine the two measures to allow for hybrid clustering, as described in:

Liu, Xinhai, Shi Yu, Frizo Janssens, Wolfgang Glänzel, Yves Moreau, and Bart De Moor. “Weighted hybrid clustering by combining text mining and bibliometrics on a large‐scale journal database.” Journal of the American Society for Information Science and Technology 61, no. 6 (2010): 1105-1119.
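
Before the full code, a toy illustration of bibliographic coupling may help: two papers are similar to the extent that their reference lists overlap, measured here as the cosine similarity between rows of a paper-by-reference matrix. The matrix and the function name coupling_similarity are made up for this sketch, which uses plain NumPy and assumes every paper cites at least one reference (otherwise the normalization would divide by zero).

```python
import numpy as np

# Rows are papers, columns are cited references (1 = the paper cites the reference).
# Made-up data: papers 0 and 1 share one reference, paper 2 shares none.
cites = np.array([
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
])

def coupling_similarity(matrix):
    """Cosine similarity between the rows of a paper-by-reference matrix."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    unit = matrix / norms
    return unit @ unit.T

sim = coupling_similarity(cites)
print(np.round(sim, 2))
```

Papers 0 and 1 each cite two references and share one, giving a similarity of 0.5, while paper 2 has similarity 0 with both.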

#Bibliographic coupling
from scipy.sparse import coo_matrix
from collections import Counter
from sklearn.metrics.pairwise import cosine_similarity

def createbibnet(df):
    allsources = Counter(df['CRparsed'].sum())
    allsources  = [x for x in allsources if allsources[x]>1]
    dfcr = df['CRparsed'].reset_index(drop=True)
    dfnet = []
    for i, n in enumerate(allsources):
        # record [reference index, paper index] for every paper that cites reference n
        dfnet.extend([i, y] for y in dfcr[dfcr.apply(lambda x: n in x)].index)
    dfnet_matrix = coo_matrix(([1] * len(dfnet), ([x[1] for x in dfnet], [x[0] for x in dfnet])), 
                              shape=(dfcr.shape[0], len(allsources)))
    return cosine_similarity(dfnet_matrix, dfnet_matrix)

#Lexical Coupling
from nltk.corpus import stopwords 
from nltk.stem.wordnet import WordNetLemmatizer
from sklearn.feature_extraction.text import TfidfVectorizer
import string
from gensim.models.phrases import Phrases, Phraser

def clean(doc):
    stop = set(stopwords.words('english'))
    exclude = set(string.punctuation) 
    lemma = WordNetLemmatizer()
    stop_free = " ".join([i for i in doc.lower().split() if i not in stop])
    punc_free = ''.join(ch for ch in stop_free if ch not in exclude)
    normalized = " ".join(lemma.lemmatize(word) for word in punc_free.split())
    normalized = " ".join([x for x in normalized.split() if not any(c.isdigit() for c in x)])
    normalized = " ".join([x for x in normalized.split() if len(x)>3])
    return normalized

def bigrams(docs):
    phrases = Phrases(docs)
    bigram = Phraser(phrases)
    docs = docs.apply(lambda x: bigram[x])
    phrases = Phrases(docs)
    trigram = Phraser(phrases)
    docs = docs.apply(lambda x: trigram[x])
    return docs

def createtfidf(df, sheet_name):
    df["lemma"] = df["txt"].apply(lambda x: clean(x).split())
    df["lemma"] = bigrams(df["lemma"])
    vect = TfidfVectorizer(min_df=1)
    tfidftemp = vect.fit_transform([" ".join(x) for x in df["lemma"]])
    return cosine_similarity(tfidftemp) 

#Hybrid network
def createhybridnet(df, weightlex, sheet_name='Sheet1'):
    bibnet = createbibnet(df)
    tfidftemp = createtfidf(df, sheet_name)
    hybnet = pd.DataFrame(np.cos((1-weightlex) * np.arccos(bibnet) + weightlex *  np.arccos(tfidftemp))).fillna(0)
    return hybnet

from itertools import combinations
def createvosviewer2filefromhybrid(hybridlexcit, minimumlink, outputfilename):
    forvisuals = []
    for x, y in combinations(hybridlexcit.index, 2):
        val = int(hybridlexcit.loc[x,y]*100)
        if val > minimumlink:
            forvisuals.append([x, y, val])
    forvisuals = pd.DataFrame(forvisuals)
    forvisuals[0] = forvisuals[0] + 1
    forvisuals[1] = forvisuals[1] + 1
    forvisuals.to_csv(outputfilename, index=False, header=False)
    
dfhybrid = createhybridnet(df, 0.5)
createvosviewer2filefromhybrid(dfhybrid, 0, r'Processed/VosViewer_2_Hybrid.txt')
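
The hybrid step in createhybridnet does not average the two similarities directly. Each cosine similarity is first converted to an angle with arccos, the two angles are combined with weights (1 - weightlex) and weightlex, and the result is mapped back to a similarity with cos. The sketch below works through this on made-up values. The function name hybrid_similarity is hypothetical, and the clipping to [0, 1] is my own addition to guard against floating-point values slightly above 1, which arccos would turn into NaN.

```python
import numpy as np

def hybrid_similarity(bib_sim, lex_sim, weightlex):
    """Combine two cosine-similarity matrices by a weighted average of their angles."""
    # Clipping is an added safeguard, not part of the original function.
    bib_angle = np.arccos(np.clip(bib_sim, 0.0, 1.0))
    lex_angle = np.arccos(np.clip(lex_sim, 0.0, 1.0))
    return np.cos((1 - weightlex) * bib_angle + weightlex * lex_angle)

# Made-up similarities for two papers.
bib = np.array([[1.0, 0.0], [0.0, 1.0]])   # no shared references
lex = np.array([[1.0, 1.0], [1.0, 1.0]])   # identical vocabulary
hyb = hybrid_similarity(bib, lex, 0.5)
print(np.round(hyb, 3))
```

With equal weights, a pair with no shared references (angle of 90 degrees) but identical text (angle of 0) ends up at 45 degrees, a hybrid similarity of about 0.707.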

If we reimport these modified files into VosViewer, we come up with this visualization, which incorporates both textual and citation data.

I can then spend tons of time just exploring the network. I look at the papers in each cluster and check which papers have high citations. I can also do this with the help of Python. We can update the clustering using the one generated by VosViewer.

df = createvosfile1('Processed/VosViewer_1_Clus.txt', df)
df[df['cluster']==1].sort_values('intTC', ascending=False)[['TI', 'AU', 'SO', 'PY', 'intTC']].head(10)
    | TI | AU | SO | PY | intTC
404 | ENDOGENOUS GROWTH THROUGH KNOWLEDGE SPILLOVERS… | Delmar, F\n Wennberg, K\n Hellerstedt, K | STRATEGIC ENTREPRENEURSHIP JOURNAL | 2011 | 24
500 | Cognitive Processes of Opportunity Recognition… | Gregoire, DA\n Barr, PS\n Shepherd, DA | ORGANIZATION SCIENCE | 2010 | 11
439 | Managing knowledge assets under conditions of … | Allarakhia, M\n Steven, W | TECHNOVATION | 2011 | 10
353 | Technology entrepreneurship | Beckman, C\n Eisenhardt, K\n Kotha, S\n … | STRATEGIC ENTREPRENEURSHIP JOURNAL | 2012 | 9
484 | IAMOT and Education: Defining a Technology and… | Yanez, M\n Khalil, TM\n Walsh, ST | TECHNOVATION | 2010 | 8
343 | TECHNOLOGY-MARKET COMBINATIONS AND THE IDENTIF… | Gregoire, DA\n Shepherd, DA | ACADEMY OF MANAGEMENT JOURNAL | 2012 | 8
443 | The Strategy-Technology Firm Fit Audit: A guid… | Walsh, ST\n Linton, JD | TECHNOLOGICAL FORECASTING AND SOCIAL CHANGE | 2011 | 8
411 | The Cognitive Perspective in Entrepreneurship:… | Gregoire, DA\n Corbett, AC\n McMullen, JS | JOURNAL OF MANAGEMENT STUDIES | 2011 | 8
596 | Technology Business Incubation: An overview of… | Mian, S\n Lamine, W\n Fayolle, A | TECHNOVATION | 2016 | 6
303 | Local responses to global technological change… | Fink, M\n Lang, R\n Harms, R | TECHNOLOGICAL FORECASTING AND SOCIAL CHANGE | 2013 | 6
df[df['cluster']==2].sort_values('intTC', ascending=False)[['TI', 'AU', 'SO', 'PY', 'intTC']].head(10)
    | TI | AU | SO | PY | intTC
410 | Entrepreneurial Origin, Technological Knowledg… | Clarysse, B\n Wright, M\n Van de Velde, E | JOURNAL OF MANAGEMENT STUDIES | 2011 | 19
461 | On growth drivers of high-tech start-ups: Expl… | Colombo, MG\n Grilli, L | JOURNAL OF BUSINESS VENTURING | 2010 | 17
387 | WHEN DOES CORPORATE VENTURE CAPITAL ADD VALUE … | Park, HD\n Steensma, HK | STRATEGIC MANAGEMENT JOURNAL | 2012 | 10
514 | The M&A dynamics of European science-based ent… | Bonardo, D\n Paleari, S\n Vismara, S | JOURNAL OF TECHNOLOGY TRANSFER | 2010 | 9
506 | The role of incubator interactions in assistin… | Scillitoe, JL\n Chakrabarti, AK | TECHNOVATION | 2010 | 9
423 | EXPLAINING GROWTH PATHS OF YOUNG TECHNOLOGY-BA… | Clarysse, B\n Bruneel, J\n Wright, M | STRATEGIC ENTREPRENEURSHIP JOURNAL | 2011 | 9
574 | CHANGING WITH THE TIMES: AN INTEGRATED VIEW OF… | Fisher, G\n Kotha, S\n Lahiri, A | ACADEMY OF MANAGEMENT REVIEW | 2016 | 9
354 | Amphibious entrepreneurs and the emergence of … | Powell, WW\n Sandholtz, KW | STRATEGIC ENTREPRENEURSHIP JOURNAL | 2012 | 8
507 | A longitudinal study of success and failure am… | Gurdon, MA\n Samsom, KJ | TECHNOVATION | 2010 | 8
324 | Are You Experienced or Are You Talented?: When… | Eesley, CE\n Roberts, EB | STRATEGIC ENTREPRENEURSHIP JOURNAL | 2012 | 8
df[df['cluster']==3].sort_values('intTC', ascending=False)[['TI', 'AU', 'SO', 'PY', 'intTC']].head(10)
    | TI | AU | SO | PY | intTC
398 | 30 years after Bayh-Dole: Reassessing academic… | Grimaldi, R\n Kenney, M\n Siegel, DS\n W… | RESEARCH POLICY | 2011 | 45
298 | Academic engagement and commercialisation: A r… | Perkmann, M\n Tartari, V\n McKelvey, M\n … | RESEARCH POLICY | 2013 | 41
425 | Why do academics engage with industry? The ent… | D’Este, P\n Perkmann, M | JOURNAL OF TECHNOLOGY TRANSFER | 2011 | 32
399 | The impact of entrepreneurial capacity, experi… | Clarysse, B\n Tartari, V\n Salter, A | RESEARCH POLICY | 2011 | 26
427 | Entrepreneurial effectiveness of European univ… | Van Looy, B\n Landoni, P\n Callaert, J\n … | RESEARCH POLICY | 2011 | 23
395 | The Bayh-Dole Act and scientist entrepreneurship | Aldridge, TT\n Audretsch, D | RESEARCH POLICY | 2011 | 20
397 | The effectiveness of university knowledge spil… | Wennberg, K\n Wiklund, J\n Wright, M | RESEARCH POLICY | 2011 | 19
392 | What motivates academic scientists to engage i… | Lam, A | RESEARCH POLICY | 2011 | 19
511 | Convergence or path dependency in policies to … | Mustar, P\n Wright, M | JOURNAL OF TECHNOLOGY TRANSFER | 2010 | 19
479 | A knowledge-based typology of university spin-… | Bathelt, H\n Kogler, DF\n Munro, AK | TECHNOVATION | 2010 | 18

Asking SkyNet Management Questions

The past week there has been a lot of fuss over the release of a language model by OpenAI, which is supposed to be the most advanced yet. There is a lot of discussion online too because of the team’s decision not to release the full version of the model, afraid that it will be misused by Russian trolls, Chinese spies or whatever. Nonetheless, a partial model has been made available for researchers to tinker with. To make things more convenient, there is also an online implementation, which spared me the trouble of installing the model on my computer.

I played with it to find out if it would help me write my doctoral dissertation. Maybe not, but it could at least give me some inspiration. These are my queries, each followed by the model’s continuation:

Prompt: Absorptive capacity is important for firms to gain competitive advantage.

A high birth rate reduces labor size relative to that in the United States (approximately 2.5 births per woman per 100 that would be generated in the United States in 2025) by the same method that causes individual WL to lag the rest of the world as a comparable technology (James and Wilcox 2005; Steuber and Turnbull 2006). Moreover, this concept establishes that WL may very well be a competitive advantage in conduct metrics secondary to WL. Thus, in conclusion, an incomplete equilibrium system often poses the problem of generating the next big PBE, ready for limiting these situations…

Prompt: To create disruptive innovations, companies should…

not work in the present day and if they do: “We’re not picking those lines for half-measures. ” Trust Poopmakers, Congress The hard law enforcement obligation imposed by the U. S. government on startups and small businesses should be absolute, followed by the next step. However, the United States should not violate copyright as being without regard to collateral damage and even have some credentials as a custodian of such equivalent works. Counterproductive courts should inquire into the bill from both sides and pass amendments for the immediate reporting of the U. S. to the American people…

The results have been disappointing, unfortunately. We are not yet at the point where doctoral students can be replaced by actual robot slaves.

Conforming vs. Differentiating in Everything

In the first month of my PhD, I came across the article by Hambrick and Chen on how academic fields develop. Briefly, they described three processes that fields have to go through to gain acceptance: differentiation, legitimation and mobilization. Differentiation means that a field should try to set itself apart from other existing fields. At the same time, however, it should not stray too far, as it still has to build legitimacy and gain recognition from the wider scientific community. Mobilization refers to the field’s ability to mobilize resources. As I see it, this last process basically serves as the fuel that advances the other two. At the time, I found the study interesting but set it aside for the next two years, not knowing how to incorporate it into my work.

[Figure: Differentiation, legitimacy building and mobilization. How new fields gain acceptance]

In the past few months, I’ve been thinking of a way to unify the different studies that I’ve been conducting for my PhD. One day, I randomly came across the article again. I then realized that the dynamic of conforming and differentiating occurs everywhere. It occurs not only at the field level but also at other levels; not only in scientific development but also in many other facets of life. At the individual level, all of us, in one way or another, conform to the communities we belong to while at the same time trying to make ourselves stand out. To contribute, we try to bring something new or unique to the groups that we are in. Upon looking further, I found that this is referred to as the theory of optimal distinctiveness in psychology circles, and that it was described as early as 1991.

Recently, this concept has also been gaining interest at the organizational level, as in this review by Zhao et al. Firms cannot compete only by conforming to other firms in their market; they also have to differentiate themselves from other players in their area. As there can be different dimensions of comparison across firms, the balancing act between conforming and differentiating can be complex. Research is therefore valuable for exploring how to orchestrate such dynamics effectively.

Having studied for the past two years how new scientific fields develop, I notice the recurring theme of conformity vs. differentiation. Fields have to manage these forces if they want to become established. Big firms have to be aware of these forces if they want to stay relevant. New entrepreneurial firms also have to balance the two to gain resources and find customers. Teams within firms, to stay innovative, have to stay in touch with what their colleagues are doing while at the same time bringing new things to the table. Researchers and managers also have to practice optimal distinctiveness.