Credit: Democracy Chronicles via Flickr
If you take a look at the list of trending repositories on GitHub, you’ll see amazing code from programmers who live around the world and efforts from firms big and small. But one thing you don’t often see is work that comes from university labs. It’s rare for the next big thing to escape from an academic computer science department and capture the attention of the world.
That’s not a knock on university research. But competing with open source projects that enjoy broad support across the industry and around the world is challenging for a handful of academics and grad students. Sure, many of the top computer science schools are well off, but that doesn’t mean the money is pouring into research. Open source programmers, on the other hand, can usually build better code faster, often because they have bosses who pay them to build something that will pay off next quarter, not next century.
Yet good computer science departments still manage to punch above -- sometimes well above -- their weight. While a good part of the research is devoted to arcane topics like the philosophical limits of computation, some of it can be tremendously useful for the world at large.
What follows are nine projects currently under development at university labs that are worth your attention. They may not be the absolute best or furthest along, but each has the potential to have a broad impact on the world of computing. Some offer shipping code, others offer mostly potential, but all offer a straightforward path for transforming our world with useful computation.
Big data is one area where academia’s focus on mathematical foundations can pay off, and one of the more prominent packages to gain attention of late is DeepDive, a tool for exploring unstructured text. While many big data projects work with well-structured information that’s already in tables, DeepDive focuses on finding correlations in raw text files and other files that aren’t organized.
The Java code runs a pipeline that pushes the raw data through a set of tools that parse natural language into streams of entities -- that is, people, places, companies, or things. Then it uses statistical algorithms to search for connections among the entities, even if they’re not explicitly spelled out. These results are then boiled down to clear inferences and inserted into an old-school database.
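To make the shape of that pipeline concrete, here is a minimal Python sketch of the three stages: pull entities out of raw text, look for connections among them, and store the results in a relational database. It is not DeepDive's code or API. The corpus, the function names (`extract_entities`, `cooccurrences`, `store`), the table layout, and the `min_count` threshold are all hypothetical, a regular expression stands in for a real natural language parser, and plain sentence-level co-occurrence counting stands in for the statistical inference the real tool performs.

```python
import itertools
import re
import sqlite3
from collections import Counter

# Hypothetical toy corpus, invented for this illustration.
DOCS = [
    "Alice Smith joined Acme in Boston. Acme later hired Bob Jones.",
    "Bob Jones met Alice Smith at Acme.",
]

# Naive stand-in for a real NLP parser: a run of capitalized words is an
# "entity". It will also pick up ordinary sentence-initial words like "The".
ENTITY = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*")


def extract_entities(sentence):
    """Return the distinct entity strings found in one sentence, sorted."""
    return sorted(set(ENTITY.findall(sentence)))


def cooccurrences(docs):
    """Count how often each pair of entities appears in the same sentence."""
    counts = Counter()
    for doc in docs:
        # Split on whitespace that follows sentence-ending punctuation.
        for sentence in re.split(r"(?<=[.!?])\s+", doc):
            for pair in itertools.combinations(extract_entities(sentence), 2):
                counts[pair] += 1
    return counts


def store(counts, min_count=2):
    """Keep pairs seen at least min_count times in an in-memory SQLite table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE related (a TEXT, b TEXT, n INTEGER)")
    db.executemany(
        "INSERT INTO related VALUES (?, ?, ?)",
        [(a, b, n) for (a, b), n in counts.items() if n >= min_count],
    )
    db.commit()
    return db


if __name__ == "__main__":
    db = store(cooccurrences(DOCS))
    for row in db.execute("SELECT a, b, n FROM related ORDER BY a, b"):
        print(row)
```

The threshold is a crude way to separate repeated associations from one-off mentions. A system like the one described would replace both the regular expression and the counting with far more capable components, but the flow from raw text to rows in a table is the same.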
The results vary depending upon the style of the text, the nature of the query, and the clarity of the writing, but in good circumstances the tool can deliver better results than humans can. The developers even report that some studies have shown that DeepDive “exceeded the quality of human volunteer annotators in both precision and recall for complex scientific articles.”