Today I want to introduce a second company in this domain: Lex Machina. This one is much closer to home (literally). It is based in Palo Alto and was founded by a couple of Stanford professors and students, one from the law school and one from the computer science department. Its product is fully automated patent litigation analysis.
Drawing on the many patent lawsuits filed around the country, the company builds a daily index and a searchable database of major cases. The value of tracking patent lawsuits is apparent to many large companies, as Apple and Samsung’s fierce fight shows. In the early 2000s, eBay lost a patent lawsuit over an obscure auction algorithm written by an unknown person, even after spending a large sum on lawyers and expert witnesses. And Google acquired Motorola Mobility largely for its large patent portfolio.
Lex Machina has a web crawler that visits major court case websites, brings back the documents, and indexes them. I can see two major NLP techniques at work here: (1) named entity recognition and (2) text classification (with supervised learning).
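To make these two techniques concrete, here is a toy sketch in Python. It is my own illustration, not Lex Machina's pipeline: the "entity recognition" is just a regular expression that pulls the two parties out of a "Plaintiff v. Defendant" caption, the classifier is a small hand-rolled Naive Bayes, and the training snippets and labels are invented for the example.

```python
import math
import re
from collections import Counter, defaultdict


def extract_parties(caption):
    """Toy stand-in for NER: split a 'Plaintiff v. Defendant' caption."""
    match = re.match(r"\s*(.+?)\s+v\.\s+(.+?)\s*$", caption)
    return (match.group(1), match.group(2)) if match else None


def tokenize(text):
    """Lowercase the text and keep runs of letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (supervised learning)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, docs, labels):
        for text, label in zip(docs, labels):
            self.doc_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            total = sum(self.word_counts[label].values())
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n_docs / total_docs)
            for tok in tokenize(text):
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training data: docket-style snippets with hand-assigned labels.
TRAIN = [
    ("complaint for patent infringement with jury demand", "complaint"),
    ("plaintiff alleges infringement of the asserted patent", "complaint"),
    ("motion to dismiss for failure to state a claim", "motion"),
    ("motion for summary judgment of invalidity", "motion"),
    ("order granting transfer of venue", "order"),
    ("order denying stay pending review", "order"),
]

clf = NaiveBayes().fit([text for text, _ in TRAIN], [label for _, label in TRAIN])

print(extract_parties("Apple Inc. v. Samsung Electronics Co."))
print(clf.predict("Motion to dismiss the complaint"))
```

A real system would presumably use a trained NER model and far more labeled documents. The point here is only the shape of the problem: pull structured names out of free text, then assign each document a type learned from labeled examples.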
Beyond these techniques, the company also runs optical character recognition (OCR) on the documents to convert them to searchable text, and stores each one as a PDF file. In addition, the crawl is limited to a few major litigation websites, which keeps the problem tractable.
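Restricting the crawl to a handful of sites is easy to picture in code. The sketch below is only my guess at the idea, not the company's crawler: the host names are placeholders, and `in_scope` and `filter_frontier` are hypothetical helpers that keep a crawl frontier inside an allow-list.

```python
from urllib.parse import urlparse

# Hypothetical allow-list; these hosts are placeholders, not real court sites.
ALLOWED_HOSTS = {"dockets.example.gov", "filings.example.org"}


def in_scope(url):
    """Return True if the URL is http(s) and its host is on the allow-list."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and parts.hostname in ALLOWED_HOSTS


def filter_frontier(urls):
    """Keep in-scope URLs, dropping duplicates while preserving order."""
    seen = set()
    kept = []
    for url in urls:
        if in_scope(url) and url not in seen:
            seen.add(url)
            kept.append(url)
    return kept


frontier = [
    "https://dockets.example.gov/case/123",
    "https://news.example.com/story",
    "https://dockets.example.gov/case/123",
    "https://filings.example.org/doc/9.pdf",
]
print(filter_frontier(frontier))
```

With a fixed, small set of hosts, the crawler never has to deal with the open web, which is a big part of why the problem stays manageable.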
I look forward to seeing the success of Lex Machina.