IBM is responding to that challenge through the open source community by publishing the source code of the Unstructured Information Management Architecture (UIMA), which provides a foundation for new technologies that will unlock the value in content. By providing access to UIMA, IBM hopes to create a sort of IT lingua franca for technology in the field.
"Companies want to get value from all of their information, but no single vendor can address all of the search, text analytics and business insight needs across all types of information and for all industries," said Nelson Mattos, vice president of Information and Interaction, IBM Research. "We are making UIMA available to the open source community to encourage innovation and allow analytics software tools from multiple sources to work together and build upon each other."
In fact, UIMA is already in use. Mayo Clinic used UIMA to help extract knowledge from approximately 20 million clinical notes. The Defense Advanced Research Projects Agency (DARPA) is also using it as part of a new program designed to absorb, analyze and interpret huge volumes of speech and text in multiple languages and provide distilled information in English.