
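Natural Language Processing with Java (Reprint Edition)

Overall rating: ★★★★★

List price: 56.00

Author: Reese (UK)

Publisher: Southeast University Press

Publication date: January 2016

Pages: 237

Word count: 318000

ISBN: 9787564160883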

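Book Description
Natural language processing (NLP) is an important area of application development, and its relevance to solving contemporary problems will only grow. Demand for applications that can be accessed through natural language, supported by NLP tasks, has increased significantly. Reese's Natural Language Processing with Java (Reprint Edition, in English) shows how to organize text automatically using techniques such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. The book introduces a range of NLP concepts in a way that readers can follow even without a background in statistical natural language processing.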

Table of Contents
Preface

Chapter 1: Introduction to NLP

What is NLP?

Why use NLP?

Why is NLP so hard?

Survey of NLP tools

Apache OpenNLP

Stanford NLP

LingPipe

GATE

UIMA

Overview of text processing tasks

Finding parts of text

Finding sentences

Finding people and things

Detecting Parts of Speech

Classifying text and documents

Extracting relationships

Using combined approaches

Understanding NLP models

Identifying the task

Selecting a model

Building and training the model

Verifying the model

Using the model

Preparing data

Summary

Chapter 2: Finding Parts of Text

Understanding the parts of text

What is tokenization?

Uses of tokenizers

Simple Java tokenizers

Using the Scanner class

Specifying the delimiter

Using the split method

Using the BreakIterator class

Using the StreamTokenizer class

Using the StringTokenizer class

Performance considerations with Java core tokenization

NLP tokenizer APIs

Using the OpenNLPTokenizer class

Using the SimpleTokenizer class

Using the WhitespaceTokenizer class

Using the TokenizerME class

Using the Stanford tokenizer

Using the PTBTokenizer class

Using the DocumentPreprocessor class

Using a pipeline

Using LingPipe tokenizers

Training a tokenizer to find parts of text

Comparing tokenizers

Understanding normalization

Converting to lowercase

Removing stopwords

Creating a StopWords class

Using LingPipe to remove stopwords

Using stemming

Using the Porter Stemmer

Stemming with LingPipe

Using lemmatization

Using the StanfordLemmatizer class

Using lemmatization in OpenNLP

Normalizing using a pipeline

Summary

Chapter 3: Finding Sentences

The SBD process

What makes SBD difficult?

Understanding SBD rules of LingPipe's HeuristicSentenceModel class

Simple Java SBDs

Using regular expressions

Using the BreakIterator class

Using NLP APIs

Using OpenNLP

Using the SentenceDetectorME class

Using the sentPosDetect method

Using the Stanford API

Using the PTBTokenizer class

Using the DocumentPreprocessor class

Using the StanfordCoreNLP class

Using LingPipe

Using the IndoEuropeanSentenceModel class

Using the SentenceChunker class

Using the MedlineSentenceModel class

Training a Sentence Detector model

Using the Trained model

Evaluating the model using the SentenceDetectorEvaluator class

Summary

Chapter 4: Finding People and Things

Why NER is difficult?

Techniques for name recognition

Lists and regular expressions

Statistical classifiers

Using regular expressions for NER

Using Java's regular expressions to find entities

Using LingPipe's RegExChunker class

Using NLP APIs

Using OpenNLP for NER

Determining the accuracy of the entity

Using other entity types

Processing multiple entity types

Using the Stanford API for NER

Using LingPipe for NER

Using LingPipe's name entity models

Using the ExactDictionaryChunker class

Training a model

Evaluating a model

Summary

Chapter 5: Detecting Parts of Speech

The tagging process

Importance of POS taggers

What makes POS difficult?

Using the NLP APIs

Using OpenNLP POS taggers

Using the OpenNLP POSTaggerME class for POS taggers

Using OpenNLP chunking

Using the POSDictionary class

Using Stanford POS taggers

Using Stanford MaxentTagger

Using the MaxentTagger class to tag textese

Using Stanford pipeline to perform tagging

Using LingPipe POS taggers

Using the HmmDecoder class with BestFirst tags

Using the HmmDecoder class with NBest tags

Determining tag confidence with the HmmDecoder class

Training the OpenNLP POSModel

Summary

Chapter 6: Classifying Texts and Documents

How classification is used

Understanding sentiment analysis

Text classifying techniques

Using APIs to classify text

Using OpenNLP

Training an OpenNLP classification model

Using DocumentCategorizerME to classify text

Using Stanford API

Using the ColumnDataClassifier class for classification

Using the Stanford pipeline to perform sentiment analysis

Using LingPipe to classify text

Training text using the Classified class

Using other training categories

Classifying text using LingPipe

Sentiment analysis using LingPipe

Language identification using LingPipe

Summary

Chapter 7: Using Parser to Extract Relationships

Relationship types

Understanding parse trees

Using extracted relationships

Extracting relationships

Using NLP APIs

Using OpenNLP

Using the Stanford API

Using the LexicalizedParser class

Using the TreePrint class

Finding word dependencies using the GrammaticalStructure class

Finding coreference resolution entities

Extracting relationships for a question-answer system

Finding the word dependencies

Determining the question type

Searching for the answer

Summary

Chapter 8: Combined Approaches

Preparing data

Using Boilerpipe to extract text from HTML

Using POI to extract text from Word documents

Using PDFBox to extract text from PDF documents

Pipelines

Using the Stanford pipeline

Using multiple cores with the Stanford pipeline

Creating a pipeline to search text

Summary

Index
