David Talby, John Snow Labs
People who work with data tend to think in very structured, linear terms. They expect B to follow A and C to follow B, not just some of the time but all of the time. Healthcare data doesn’t work that way. It is diverse and complex, comprising well-structured datasets (such as diagnostic test results and coded data) and an extensive amount of unstructured data that remains in free-text narratives.
Holding critical information about symptom profiles, patient-specific treatment plans, and risk factors, this unstructured data can support crucial clinical decisions and open up new avenues for R&D. The widespread adoption of electronic medical records (EMR) has made this massive amount of data digitally available, which allows healthcare organizations to streamline their operations. However, analyzing unstructured free-text clinical data is cumbersome owing to hyper-local jargon, acronyms, clinical guidelines, and unwritten assumptions, to name a few complications. What healthcare organizations require today is a sound NLP solution that can help them meet their data-related needs, including extracting the critical clinical facts that exist as free text in their EMR.
Set against this background, John Snow Labs, a healthcare AI company, has emerged as a leading NLP solution provider. Named after the physician Dr. John Snow, who traced the 1854 London cholera outbreak to its source, the company is revolutionizing the healthcare and life sciences industry with its AI products and services. John Snow Labs comprises a global team of specialists, a third of whom hold a Ph.D. or M.D. and three quarters of whom hold at least a Master’s degree, in disciplines such as data science, medicine, software engineering, pharma, information security, and DataOps.
The Confluence of Technical Expertise and Industry Experience
John Snow Labs is widely recognized as the developer of Spark NLP, the open-source library that equips the community with state-of-the-art NLP capabilities. Spark NLP is the only natively distributed open-source text-processing library for Python, Java, and Scala. It offers the full functionality of traditional NLP libraries (like spaCy, NLTK, Stanford CoreNLP, and OpenNLP) and adds functionality such as spell checking, sentiment analysis, and OCR. “Spark NLP ships with over 150 pre-trained pipelines and models, enabling users to get things done within minutes and a few lines of code,” comments David Talby, CTO at John Snow Labs.
Spark NLP provides production-grade versions of the latest research in NLP, so that customers and the open-source community are continuously upgraded to the most accurate algorithms and models as they are published. “Our team regularly reads the latest academic papers in this area and productizes the most accurate reproducible results. We aim to provide the production-grade, trainable, and scalable implementations of the best techniques available,” explains Talby.
Spark NLP also places a major emphasis on speed and scalability. It is the only open-source NLP library that is natively scalable, requiring zero code changes to scale a pipeline to any Spark cluster. It has also been heavily optimized for single-machine use on modern hardware: Spark NLP can leverage both GPUs and current Intel Xeon processors to their maximum potential.
By all accounts, John Snow Labs has created the most accurate software in history to extract facts from unstructured text.
Spark NLP has a global following: a recent analysis found that, of the people interested in the library, 44% are from the Americas, 24% from Asia-Pacific, and 22% from the EMEA region. Spark NLP now provides out-of-the-box support for more than twenty languages, including German, Spanish, Russian, Turkish, Portuguese, and French. The team’s goal is to provide free, open-source, state-of-the-art natural language processing models for all the widely spoken languages on earth.
The Healthcare AI Platform
John Snow Labs also offers a proprietary end-to-end Healthcare AI Platform that provides data integration, data exploration & visualization, model development, and model deployment – all within a unified, hardened, air-gap managed cluster. “The platform is Kubernetes-based and can, therefore, be deployed anywhere. We have production deployments on AWS, Azure, and on-premise, and it is designed from the ground up to process personal health information in a secure and compliant way,” says Talby. The AI platform supports hundreds of security controls, central identity management and role-based access control, encryption and key management, 360-degree monitoring, scaling, and auto-scaling capabilities.
AI model governance is another significant component of the platform. It provides the ability to deploy models as APIs with one line of code and enables model versioning, rollback and rollout capabilities, authentication & authorization, a full audit trail, and reproducible model training.
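To make the governance ideas above concrete, here is a minimal Python sketch of versioning, rollout, rollback, and an audit trail. It is an illustration only and not the platform's actual API: the `ModelRegistry` class, its method names, and the stand-in models are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ModelRegistry:
    """Hypothetical in-memory registry: versioned models plus an audit trail."""
    versions: Dict[int, Callable[[str], str]] = field(default_factory=dict)
    history: List[int] = field(default_factory=list)  # versions made active, oldest first
    audit_log: List[Tuple[str, int]] = field(default_factory=list)

    def register(self, model: Callable[[str], str]) -> int:
        """Store a model under the next version number and record the event."""
        version = len(self.versions) + 1
        self.versions[version] = model
        self.audit_log.append(("register", version))
        return version

    def rollout(self, version: int) -> None:
        """Make a registered version the active one."""
        if version not in self.versions:
            raise KeyError(f"unknown model version: {version}")
        self.history.append(version)
        self.audit_log.append(("rollout", version))

    def rollback(self) -> int:
        """Return to the previously active version."""
        if len(self.history) < 2:
            raise RuntimeError("no earlier version to roll back to")
        self.history.pop()
        self.audit_log.append(("rollback", self.history[-1]))
        return self.history[-1]

    @property
    def active(self) -> Optional[int]:
        return self.history[-1] if self.history else None

    def predict(self, text: str) -> str:
        """Serve a prediction from whichever version is currently active."""
        if self.active is None:
            raise RuntimeError("no model has been rolled out")
        return self.versions[self.active](text)


# Usage: two stand-in "models" (plain functions, for illustration only).
registry = ModelRegistry()
v1 = registry.register(lambda text: "negative")
v2 = registry.register(lambda text: "positive")
registry.rollout(v1)
registry.rollout(v2)
registry.rollback()  # version 1 is active again
```

A production system would also persist the audit trail and put authentication in front of `predict`; the sketch leaves both out.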
Growth Propelled by Exemplary Success Stories
The unique value proposition of the Spark NLP library and the AI platform, coupled with John Snow Labs’ domain and technology expertise, has enabled it to work with some of the world’s largest, most risk-averse companies in the healthcare and life sciences space, across oncology, neurology, orthopedics, radiology, pathology, emergency medicine, home health, hospice care, and mental health.
In one instance, the company worked with Roche, the world’s second largest pharmaceutical company. As part of building its Navify clinical decision support platform, Roche Diagnostic Information Solutions faced challenges in patient risk prediction, cohort selection, and clinical decision support because most of the pertinent facts were held in unstructured free-text reports. There was a strong need to unlock unstructured data to build a comprehensive longitudinal view of the patient for both clinical decision support and population analytics. After thorough vetting, John Snow Labs was chosen to help build a solution that can extract structured clinical facts from messy, multi-page documents, identify the role each term plays in a sentence, and train a model to extract more than 45 entities from pathology and radiology reports.
In another instance, John Snow Labs was chosen by SelectData to help build its AI platform for high-quality clinical coding, audit, and clinical documentation improvement (CDI). This solution required interpreting millions of patient stories, some well over 100 pages long, including the ability to OCR these records from scanned documents, de-identify them, and then build a semi-automated workflow that seamlessly combines AI and human experts.
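The de-identification and human-in-the-loop steps in such a workflow can be sketched in a few lines of Python. This is a toy stand-in and not the SelectData solution or Spark NLP itself: the regular expressions, the `deidentify` and `route` helpers, the 0.9 threshold, and the sample scores are all assumptions made for illustration. Real de-identification relies on trained models and covers many more identifier types.

```python
import re
from typing import List, Tuple

# Illustrative patterns only (hypothetical); a real system handles far more cases.
PATTERNS = [
    ("DATE", re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")),
    ("PHONE", re.compile(r"\b\d{3}-\d{3}-\d{4}\b")),
    ("MRN", re.compile(r"\bMRN[:\s]*\d+\b")),
]


def deidentify(text: str) -> str:
    """Replace each matched identifier with a placeholder tag such as <DATE>."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"<{label}>", text)
    return text


def route(predictions: List[Tuple[str, float]],
          threshold: float = 0.9) -> Tuple[List[str], List[str]]:
    """Split predicted codes into auto-accepted ones and ones sent to a human reviewer."""
    auto = [code for code, score in predictions if score >= threshold]
    review = [code for code, score in predictions if score < threshold]
    return auto, review


note = "Patient seen on 03/14/2020, MRN: 123456, call 555-123-4567."
clean_note = deidentify(note)

# Made-up confidence scores for two example codes.
accepted, needs_review = route([("E11.9", 0.97), ("I10", 0.62)])
```

The threshold is the design lever here: raising it sends more predictions to human experts, lowering it automates more of the workload.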
Backed by many such success stories, John Snow Labs has established itself as a thought leader in the healthcare AI and NLP space. The company regularly takes part in top AI industry conferences such as Strata Data, Spark+AI Summit, O’Reilly AI, Open Data Science, Predictive Analytics World, and Global AI Conference, to name a few.
John Snow Labs is all set to break new ground and help many more healthcare entities realize the potential of AI for improving healthcare and human well-being in the 21st century. Striding ahead, the company will put the latest academic research, novel innovations, and AI best practices into its customers’ hands.
“John Snow Labs exists to help put AI to good use faster. That is why the company championed Data Philanthropy – sharing valuable private data to help solve community needs – from its very first month. That is why we’ve grown our commitment and giving to Open Source and Open Data every year. Nowadays, it is the reason we provide free access to all our software for teams fighting the COVID-19 pandemic,” concludes Talby.