
By Fahad Usman
Last time, I showed you how to get started with Stanford CoreNLP. This tutorial is about Apache OpenNLP. Here are some of the core features of OpenNLP:
Features of OpenNLP
Named Entity Recognition (NER) − Using NER, you can extract names of locations, people etc. in a given text.
Summarise − Summarise paragraphs, articles, documents or collections of them.
Searching − Search using a given string and also extract its synonyms, even if the given word is altered or misspelled.
Tagging (POS) − Divide the text into various grammatical elements for further analysis.
Translation − Translate one language into another.
Information grouping − Group textual information in the content of the document, just like Parts of speech.
Natural Language Generation − It is used for generating information from a database and automating the information reports such as weather analysis or medical reports.
Feedback Analysis − Do you collect verbatim survey responses about products or services? Analyse how well the service or product is doing.
Speech recognition − openNLP has some builtin features for this requirement.
Prerequisites
- Eclipse (IDE for Java Developers)
- JDK (I am using the latest JDK 11)
In Eclipse, create a new project:
File -> New -> Project
Select Java Project, give it a name and hit Finish.
“OpenNLP supports the most common NLP tasks, such as tokenisation, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, language detection and coreference resolution.”

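Right click on the project and select Configure -> Convert to Maven Project
Let it do its thing and it will open up the pom.xml file.
Add the following dependencies to pom.xml, inside a <dependencies> element (create one if the file does not have it yet):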
OpenNLP Tools Dependency
    <dependency>
        <groupId>org.apache.opennlp</groupId>
        <artifactId>opennlp-tools</artifactId>
        <version>1.9.0</version>
    </dependency>
OpenNLP UIMA Annotators Dependency
    <dependency>
        <groupId>org.apache.opennlp</groupId>
        <artifactId>opennlp-uima</artifactId>
        <version>1.9.0</version>
    </dependency>
Hit Save.
Now head to the OpenNLP site and download the pre-built model bin files. I downloaded one (en-sent.bin) just to test.
Save it in your working directory in a new folder called
OpenNLP_models
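If you would rather not hardcode an absolute path when loading the model, here is a minimal sketch of resolving the file relative to the working directory. It assumes the program is started from the project directory (which I believe is Eclipse's default, though your run configuration may differ). The class name ModelPaths and the helper resolveModel are hypothetical names made up for this illustration; only the folder name OpenNLP_models and the file name en-sent.bin come from this tutorial.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ModelPaths {
    // Folder name used in this tutorial; the helper itself is hypothetical.
    static final String MODEL_DIR = "OpenNLP_models";

    // Builds <working directory>/OpenNLP_models/<fileName>.
    static Path resolveModel(String fileName) {
        return Paths.get(System.getProperty("user.dir"), MODEL_DIR, fileName);
    }

    public static void main(String[] args) {
        Path modelPath = resolveModel("en-sent.bin");
        // Report where we are looking and whether the file is actually there.
        System.out.println("Looking for model at: " + modelPath);
        System.out.println("Found: " + Files.isRegularFile(modelPath));
    }
}
```

The resulting path can be passed to FileInputStream via modelPath.toFile(), so the same code should keep working if the project is moved to another machine.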
Now head back to Eclipse. Right click on the src folder, select New -> Package and create a new package named “openNLP”. Then right click on that package, select New -> Class, create a new class named “SentenceDetection_RE” and paste the following code (change the model path to match your own machine):
package openNLP;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import opennlp.tools.sentdetect.SentenceDetectorME;
import opennlp.tools.sentdetect.SentenceModel;
import opennlp.tools.util.Span;

public class SentenceDetection_RE {
    public static void main(String[] args) throws IOException {
        String paragraph = "Hi. How are you? Welcome to FahadUsman.com. "
                + "I provide free tutorials on various technologies";

        // Load the sentence detector model. try-with-resources closes the stream,
        // and a failed load now stops the program instead of leaving model as null.
        SentenceModel model;
        try (InputStream inputStream = new FileInputStream(
                "/Users/Fahad/eclipse-workspace/openNLP/OpenNLP_models/en-sent.bin")) {
            model = new SentenceModel(inputStream);
        }

        // Instantiate the SentenceDetectorME class
        SentenceDetectorME detector = new SentenceDetectorME(model);

        // Detect the position of each sentence in the raw text
        Span[] spans = detector.sentPosDetect(paragraph);

        // Print the span of each sentence in the paragraph
        for (Span span : spans) {
            System.out.println(span);
        }
    }
}
Once done, run it as a Java Application. The console should show one line per detected sentence, each a span of character offsets printed in a form like [start..end).
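Each span is a pair of character offsets into the original string, with the end offset exclusive. To make that concrete, here is a small sketch in plain Java that turns such offsets back into text. The offsets below are written by hand for a shortened string, purely for illustration; they are not output from the model, and the class name SpanOffsets and the helper covered are hypothetical.

```java
public class SpanOffsets {
    // Returns the text covered by the half-open range [start, end).
    static String covered(String text, int start, int end) {
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        String text = "Hi. How are you?";
        // Hand-written offsets for illustration, not output from the model.
        int[][] offsets = { {0, 3}, {4, 16} };
        for (int[] o : offsets) {
            // Prints the range followed by the text it covers.
            System.out.println("[" + o[0] + ".." + o[1] + ") -> " + covered(text, o[0], o[1]));
        }
    }
}
```

With real OpenNLP spans you would not need your own helper: Span offers getStart(), getEnd() and getCoveredText(text), and if you only want the sentence strings, SentenceDetectorME also has sentDetect(paragraph), which returns a String[] directly.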
That’s it! Enjoy and good luck making your own custom built models! 🙂