Data Visualization, Patents, and Innovation
As a postdoctoral fellow at the Coleman Fung Institute for Engineering Leadership in UC Berkeley’s College of Engineering, I had the privilege of joining a group of scholars applying ongoing research in data visualization to the history of science and technology. We used an incredible database of US patent applications and grants to study intellectual property law and policy, innovation, and technology in society.
Patent Co-Invention Network Tool
As part of our work on understanding a particular breakthrough technology, we wanted to see the collaborations among inventors that made it possible. Our visualization specialist, Guan-Cheng Li, created a live-rendering social network diagram for some sample data we had gathered manually. In the diagram, a link between two inventors means that both were listed as inventors on the same patent within the chosen time frame.
I then programmed an automated tool (using PHP/MySQL) that creates these social network diagrams for any chosen inventor(s) or technological areas, in any time frame from 1975 onward, which is when our data begins. It can classify inventors by their affiliations (the businesses or universities they worked for), or by affiliation with “academia” or “industry,” in order to explore academic invention’s evolving role in industrial science. That tool is available here:
Patent Co-Invention Network Tool
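To make the linking rule concrete, here is a minimal Python sketch of how co-invention links could be derived from patent records. It is not the tool's actual PHP/MySQL code; the record layout (`id`, `year`, `inventors`), the function name `co_invention_edges`, and the sample data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def co_invention_edges(patents, start_year, end_year):
    """Count shared patents for each inventor pair within [start_year, end_year]."""
    edges = Counter()
    for patent in patents:
        if start_year <= patent["year"] <= end_year:
            # set() drops repeated names; sorted() makes each pair order-independent
            for a, b in combinations(sorted(set(patent["inventors"])), 2):
                edges[(a, b)] += 1
    return edges


# Hypothetical sample records, not drawn from the real patent database
patents = [
    {"id": "P1", "year": 1997, "inventors": ["A", "B", "C"]},
    {"id": "P2", "year": 1999, "inventors": ["A", "B"]},
    {"id": "P3", "year": 2003, "inventors": ["C", "D"]},
]

edges = co_invention_edges(patents, 1996, 2001)
for (a, b), shared in sorted(edges.items()):
    print(a, b, shared)
```

Each key in `edges` is one link in the diagram, and the count could serve as a link weight. The 2003 patent falls outside the chosen window, so it contributes no link.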
The following sample images represent the networks surrounding a set of professors at UC Berkeley, who together invented a particularly important semiconductor technology in the late 1990s:
Inventors, co-inventors, and co-co-inventors within 1990-2001
Inventors, co-inventors, co-co-inventors, and co-co-co-inventors within 1996-2001
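The captions above describe networks of increasing depth: co-inventors are one link from the chosen inventors, co-co-inventors two links, and so on. As a rough illustration only (my own Python sketch under assumed names, not the PHP/MySQL code behind the tool), a breadth-first expansion over a list of co-invention links could look like this. The function `neighborhood` and the sample links are hypothetical.

```python
from collections import defaultdict


def neighborhood(edges, seeds, depth):
    """Return every inventor within `depth` co-invention links of any seed."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set(seeds)
    frontier = set(seeds)
    for _ in range(depth):
        # Step one link outward, keeping only inventors not yet visited
        frontier = {n for inv in frontier for n in adjacency[inv]} - seen
        seen |= frontier
    return seen


# Hypothetical chain of co-invention links: A-B, B-C, C-D
edges = [("A", "B"), ("B", "C"), ("C", "D")]

two_hops = neighborhood(edges, ["A"], 2)    # seed, co-inventors, co-co-inventors
three_hops = neighborhood(edges, ["A"], 3)  # adds co-co-co-inventors
print(sorted(two_hops), sorted(three_hops))
```

A depth of 2 corresponds to the first caption and a depth of 3 to the second.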
Here is the code that generates this, with login credentials to the databases removed:
For examples of the type of work I am engaged in, see the blog of Lee Fleming, the Faculty Director of the Fung Institute: http://www.funginstitute.berkeley.edu/blog-categories/faculty-directors-blog#
One tool we are developing is the “tech flow” visualization, which uses semantic comparison of patent contents to chart linguistically similar patents. The tool is under development, but here is a sample visualization:
Digital Tools for Outreach
Undergraduate Research at the Virginia Center for Digital History
As an undergraduate student at the University of Virginia, I worked for two summers (2004 and 2006) as a digital history researcher at the Virginia Center for Digital History. Working under the guidance of scholars such as Ed Ayers, Clark Scott Nesbit, Jr. (@csnesbit), and Andrew Torget (@andrewtorget), I was able to contribute to a number of early digital humanities projects.
The Valley of the Shadow Project (http://valley.lib.virginia.edu/)
“The Valley of the Shadow Project details life in two American communities, one Northern and one Southern, from the time of John Brown’s Raid through the era of Reconstruction. In this digital archive you may explore thousands of original letters and diaries, newspapers and speeches, census and church records, left by men and women in Augusta County, Virginia, and Franklin County, Pennsylvania. Giving voice to hundreds of individual people, the Valley Project tells forgotten stories of life during the era of the Civil War.”
The Valley Project is one of the pioneering projects in the field of digital history, and it long predates my involvement (it originated in the early 1990s). I helped with acquiring, processing, transcribing, coding, verifying, and summarizing documents, including letters, diaries, and newspaper articles.
The Disrespect Index: College Basketball vs. the Spread
Curious about how often my team was beating the spread in college basketball, I put together a site that ranks all college basketball teams by this metric: http://disrespectindex.com.
This involved scraping historical closing odds, processing the data in Python, inserting it into a MySQL database, and then serving requests with PHP. The results provide a unique look into how gamblers’ collective expectations match up to reality, and reveal that a number of “Cinderella run” teams were grossly underrated throughout the season, even by those willing to put money on the line.
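As a rough sketch of the Python processing step, the snippet below computes how often each team beat the closing spread and ranks teams by that rate. The site's exact formula is not reproduced here. The field names, the sign convention (a negative `spread` means the team was favored), the choice to skip pushes, and the sample games are all assumptions for illustration.

```python
from collections import defaultdict


def cover_rates(games):
    """Return {team: share of non-push games in which the team beat the spread}."""
    covers = defaultdict(int)
    decided = defaultdict(int)
    for g in games:
        # Actual margin plus the team's closing spread; positive means a cover
        adjusted = g["team_score"] - g["opp_score"] + g["spread"]
        if adjusted == 0:
            continue  # a push counts neither for nor against the team
        decided[g["team"]] += 1
        if adjusted > 0:
            covers[g["team"]] += 1
    return {team: covers[team] / n for team, n in decided.items()}


# Hypothetical games, not real results
games = [
    {"team": "Underdog U", "team_score": 70, "opp_score": 72, "spread": 6.5},
    {"team": "Underdog U", "team_score": 80, "opp_score": 60, "spread": -3.0},
    {"team": "Favorite St", "team_score": 75, "opp_score": 70, "spread": -7.5},
    {"team": "Favorite St", "team_score": 68, "opp_score": 60, "spread": -8.0},
]

rates = cover_rates(games)
ranking = sorted(rates, key=rates.get, reverse=True)
print(ranking, rates)
```

In this toy data, "Underdog U" covers even in a game it lost, which is the kind of underrating the rankings are meant to surface.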