Hatena::ブログ(Diary)

Solr, Python, MacBook Air in Shinagawa Seaside

2016-09-06

Notes: trying すもももももももものうち with Basis Technology's Rosette and Kuromoji

#!/usr/local/jython/bin/jython
# -*- coding: utf-8 -*-

# Parsing showdown: Kuromoji vs. Rosette

# Kuromoji
from com.atilika.kuromoji.unidic import Tokenizer

# BASIS TECHNOLOGY Rosette 
from com.basistech.util import Pathnames
from com.basistech.util import LanguageCode
from com.basistech.rlp import EnvironmentParameters
from com.basistech.rlp import RLPEnvironment
from com.basistech.rlp import ContextParameters
from com.basistech.rlp import TokenIteratorResultAccess
from com.basistech.rlp import ResultAccess
from com.basistech.rlp import TokenData

from java.io import File
import sys, os

if __name__ == "__main__":
 parseWord = u"すもももももももものうち"

 # Kuromoji ↓
 tokenizer = Tokenizer()
 tokens = tokenizer.tokenize( parseWord )

 print "\nKuromoji"
 for token in tokens:
  print token.getSurface() + "\t" + token.getAllFeatures()
 # Kuromoji ↑

 # BASIS TECHNOLOGY Rosette ↓
 # Parameter setup
 btRoot = "/hoge/BasisTech"
 Pathnames.setBTRootDirectory( btRoot )
 envParams = EnvironmentParameters()
 environmentPath = btRoot + "/rlp/etc/rlp-environment.xml"
 envParams.setEnvironmentDefinition( File( environmentPath ) )
 rlpEnv = RLPEnvironment( envParams )
 rlpEnv.initialize()
 contextParam = ContextParameters()
 contextPath = btRoot + "/rlp/samples/etc/rlp-bl-context.xml"
 contextParam.setContextDefinition( File(contextPath) )
 rlpContext = rlpEnv.getContext(contextParam)
 rlpContext.setProperty("com.basistech.jsonw.skip", "true")

 # Run the morphological analysis
 rlpContext.process(parseWord, LanguageCode.UNKNOWN)

 # Retrieve the morphological analysis results
 resultAccess = ResultAccess(rlpContext)
 tokenResultAccess = TokenIteratorResultAccess( resultAccess )
 tokenData = TokenData()

 print "\nBASIS TECHNOLOGY Rosette"
 while tokenResultAccess.next(tokenData):
  print tokenData.getText() +'\t',

  if tokenData.getPartOfSpeech():
   print tokenData.getPartOfSpeech(),

  if tokenData.getLemma():
   print tokenData.getLemma(),

  print
 # BASIS TECHNOLOGY Rosette ↑




Kuromoji
すもも  名詞,普通名詞,一般,*,*,*,スモモ,李,すもも,スモモ,すもも,スモモ,和,*,*,*,*
も      助詞,係助詞,*,*,*,*,モ,も,も,モ,も,モ,和,*,*,*,*
もも    名詞,普通名詞,一般,*,*,*,モモ,桃,もも,モモ,もも,モモ,和,*,*,*,*
も      助詞,係助詞,*,*,*,*,モ,も,も,モ,も,モ,和,*,*,*,*
もも    名詞,普通名詞,一般,*,*,*,モモ,桃,もも,モモ,もも,モモ,和,*,*,*,*
の      助詞,格助詞,*,*,*,*,ノ,の,の,ノ,の,ノ,和,*,*,*,*
うち    名詞,普通名詞,副詞可能,*,*,*,ウチ,内,うち,ウチ,うち,ウチ,和,*,*,*,*

BASIS TECHNOLOGY Rosette
すもも  NC
もも    NC
もも    NC
も      PL
もの    PL
うち    V うつ
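
To see exactly where the two analyzers disagree, the token boundaries can be compared as character offsets. This is only a rough sketch written for Python 3 (not the Jython script above). The surface forms are copied from the two listings, and the helper names `boundaries` and `compare` are made up for this memo.

```python
# Surface forms copied from the Kuromoji and Rosette listings above.
KUROMOJI = ["すもも", "も", "もも", "も", "もも", "の", "うち"]
ROSETTE = ["すもも", "もも", "もも", "も", "もの", "うち"]


def boundaries(tokens):
    """Return the character offsets where one token ends and the next begins."""
    offsets = set()
    position = 0
    for token in tokens[:-1]:  # the end of the last token is not an internal boundary
        position += len(token)
        offsets.add(position)
    return offsets


def compare(a, b):
    """Return (shared, only_in_a, only_in_b) boundary offsets as sorted lists."""
    ba, bb = boundaries(a), boundaries(b)
    return sorted(ba & bb), sorted(ba - bb), sorted(bb - ba)


if __name__ == "__main__":
    shared, only_kuromoji, only_rosette = compare(KUROMOJI, ROSETTE)
    print("shared boundaries:", shared)
    print("Kuromoji only:", only_kuromoji)
    print("Rosette only:", only_rosette)
```

Both token lists join back into the same input string, so every difference between the two results is a difference in where the cuts fall.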

2016-08-08

Notes: what I did when I got "No results found"

Change the date/time range of the data from the clock icon at the top right of the screen.

The default range is Last 15 minutes, so if the only data available is more than 15 minutes old, the message below appears and no graph is displayed.

No results found 
Unfortunately I could not find any results matching your search. I tried really hard. I looked all over the place and frankly, I just couldn't find anything good. Help me, help you. Here are some ideas:

Expand your time range
I see you are looking at an index with a date field. It is possible your query does not match anything in the current time range, or that there is no data at all in the currently selected time range. Click the button below to open the time picker. For future reference you can open the time picker by clicking the time picker in the top right corner of your screen.

Refine your query
The search bar at the top uses Elasticsearch's support for Lucene Query String syntax. Let's say we're searching web server logs that have been parsed into a few fields.
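
Why the default range hides everything can be sketched in a few lines of Python. This only models the time filter and is not how the tool implements it. The function name `visible_in_range` and the sample timestamps are invented for illustration.

```python
from datetime import datetime, timedelta


def visible_in_range(timestamps, now, window):
    """Keep only the timestamps that fall inside [now - window, now]."""
    cutoff = now - window
    return [t for t in timestamps if cutoff <= t <= now]


# Hypothetical data: every document is older than 15 minutes.
now = datetime(2016, 8, 8, 12, 0, 0)
timestamps = [now - timedelta(minutes=40), now - timedelta(hours=3)]

last_15_minutes = visible_in_range(timestamps, now, timedelta(minutes=15))
last_24_hours = visible_in_range(timestamps, now, timedelta(hours=24))

if __name__ == "__main__":
    print("Last 15 minutes:", len(last_15_minutes), "documents")
    print("Last 24 hours:", len(last_24_hours), "documents")
```

With the narrow window nothing matches, which corresponds to the "No results found" case. Widening the window brings the same documents back.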

2016-08-01

2016-07-28

Notes: building a natural language processing environment

Upgrade Java
$ sudo yum -y install java-1.8.0-openjdk-devel
$ sudo alternatives --config java <- select 1.8

Install Jython
$ wget "http://search.maven.org/remotecontent?filepath=org/python/jython-installer/2.7.0/jython-installer-2.7.0.jar"
$ mv "remotecontent?filepath=org%2Fpython%2Fjython-installer%2F2.7.0%2Fjython-installer-2.7.0.jar" /tmp/jython-installer-2.7.0.jar
$ sudo su -
#  java -jar /tmp/jython-installer-2.7.0.jar -c
Welcome to Jython !
You are about to install Jython version 2.7.0
(at any time, answer c to cancel the installation)
For the installation process, the following languages are available: English, German
Please select your language [E/g] >>>
Do you want to read the license agreement now ? [y/N] >>>
Do you accept the license agreement ? [Y/n] >>>
The following installation types are available:
  1. All (everything, including sources)
  2. Standard (core, library modules, demos and examples, documentation)
  3. Minimum (core)
  9. Standalone (a single, executable .jar)
Please select the installation type [ 1 /2/3/9] >>> 2
Do you want to install additional parts ? [y/N] >>>
Do you want to exclude parts from the installation ? [y/N] >>>
Please enter the target directory >>> /usr/local/jython
Unable to find directory /usr/local/jython, create it ? [Y/n] >>>
Your java version to start Jython is: Oracle Corporation / 1.8.0_101
Your operating system version is: Linux / 4.4.11-23.53.amzn1.x86_64
Summary:
  - mod: true
  - demo: true
  - doc: true
  - src: false
  - ensurepip: true
  - JRE: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.101-3.b13.24.amzn1.x86_64/jre
Please confirm copying of files to directory /usr/local/jython [Y/n] >>>
 10 %
 20 %
 30 %
 40 %
 50 %
 60 %
 70 %
Generating start scripts ...
Installing pip and setuptools
 90 %
Ignoring indexes: https://pypi.python.org/simple/
Downloading/unpacking setuptools
Downloading/unpacking pip
Installing collected packages: setuptools, pip
Successfully installed setuptools pip
Cleaning up...
 100 %
Do you want to show the contents of README ? [y/N] >>>
Congratulations! You successfully installed Jython 2.7.0 to directory /usr/local/jython.

$ sudo chmod -R 777 /usr/local/jython/cachedir

Install the build (make) environment
$ sudo yum -y install gcc-c++ glibc-headers openssl-devel readline libyaml-devel readline-devel zlib zlib-devel libffi-devel libxml2 libxslt libxml2-devel libxslt-devel sqlite-devel

rzsz
$ wget http://ohse.de/uwe/releases/lrzsz-0.12.20.tar.gz
$ tar -xzvf lrzsz-0.12.20.tar.gz
$ cd lrzsz-0.12.20
$ ./configure --prefix=/usr/local
$ make
$ sudo su -
# make install
# cd /usr/local/bin
# ln -s lrz rz
# ln -s lsz sz

elasticsearch-py
$ sudo pip install elasticsearch
$ python
>>> from elasticsearch import Elasticsearch

$ sudo /usr/local/jython/bin/pip install elasticsearch
$ jython
>>> from elasticsearch import Elasticsearch
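
The same import check can be scripted so it runs unchanged under both `python` and `jython`. A minimal sketch; the helper name `can_import` is made up here, and it only confirms that the module loads, not that a server is reachable.

```python
def can_import(module_name):
    """Return True if the named module can be imported by this interpreter."""
    try:
        __import__(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Module installed in the steps above.
    for name in ["elasticsearch"]:
        status = "ok" if can_import(name) else "missing"
        print(name + ": " + status)
```

Running the file once with each interpreter shows which of the two pip installs actually took effect.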

ICU
$ wget http://download.icu-project.org/files/icu4j/55.1/icu4j-55_1.jar
$ vi ~/.bash_profile
    export CLASSPATH=$CLASSPATH:/hoge/icu4j-55_1.jar


mvn
$ sudo wget http://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo -O /etc/yum.repos.d/epel-apache-maven.repo
$ sudo sed -i s/\$releasever/6/g /etc/yum.repos.d/epel-apache-maven.repo
$ sudo yum install -y apache-maven
$ mvn --version

mecab
$ wget http://mecab.googlecode.com/files/mecab-0.996.tar.gz
$ tar zvxf mecab-0.996.tar.gz
$ cd mecab-0.996
$ ./configure
$ make
$ sudo make install

mecab-python
$ sudo su -
# pip install https://mecab.googlecode.com/files/mecab-python-0.996.tar.gz
$ python
>>> import MeCab

UniDic
$ wget https://osdn.jp/projects/unidic/downloads/58338/unidic-mecab-2.1.2_src.zip
$ unzip unidic-mecab-2.1.2_src.zip
$ cd unidic-mecab-2.1.2_src
$ ./configure
$ make
$ sudo make install
$ sudo ldconfig
$ sudo vi /usr/local/etc/mecabrc 
    ; dicdir =  /usr/local/lib/mecab/dic/ipadic
    dicdir =  /usr/local/lib/mecab/dic/unidic
$ mecab
すもももももももものうち
すもも  スモモ  スモモ  李      名詞-普通名詞-一般
も      モ      モ      も      助詞-係助詞
もも    モモ    モモ    桃      名詞-普通名詞-一般
も      モ      モ      も      助詞-係助詞
もも    モモ    モモ    桃      名詞-普通名詞-一般
の      ノ      ノ      の      助詞-格助詞
うち    ウチ    ウチ    内      名詞-普通名詞-副詞可能
EOS
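
For later processing, output in this shape can be split into columns. This is a sketch that assumes the real separators are tabs (the listing above shows them expanded to spaces) and that `EOS` marks the end of a sentence. The function name `parse_mecab_output` and the embedded sample are illustrative only.

```python
def parse_mecab_output(text):
    """Split MeCab output into (surface, [remaining columns]) pairs."""
    rows = []
    for line in text.splitlines():
        if not line or line == "EOS":  # skip blank lines and the end-of-sentence marker
            continue
        fields = line.split("\t")
        rows.append((fields[0], fields[1:]))
    return rows


# Two lines taken from the listing above, with tabs as column separators.
SAMPLE = (
    "も\tモ\tモ\tも\t助詞-係助詞\n"
    "うち\tウチ\tウチ\t内\t名詞-普通名詞-副詞可能\n"
    "EOS\n"
)

if __name__ == "__main__":
    for surface, columns in parse_mecab_output(SAMPLE):
        print(surface, columns[-1])
```

In this sample the last column is the part of speech. Real UniDic output may carry further columns, so indexing from the end is an assumption that should be checked against the actual output.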

kuromoji 
$ sudo yum install git
$ cd ~/work
$ git clone https://github.com/atilika/kuromoji.git
$ cd kuromoji
$ mvn -pl kuromoji-unidic -am package <- fails when there is too little memory
$ vi ~/.bash_profile
    export CLASSPATH=$CLASSPATH:$HOME/work/kuromoji/kuromoji-unidic/target/kuromoji-unidic-1.0-SNAPSHOT.jar:$HOME/work/kuromoji/kuromoji-core/target/kuromoji-core-1.0-SNAPSHOT.jar

2016-07-27

Notes: version upgrade

The pip bundled with Jython was failing with an error, so I upgraded OpenJDK.

$ sudo yum -y install java-1.8.0-openjdk-devel
$ sudo alternatives --config java <- select 1.8
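
After switching, it is worth confirming which Java version is actually active. A small sketch that parses a version string like the one printed by the Jython installer (`1.8.0_101`). The function name `parse_java_version` is made up, and treating 1.8 as the minimum is my assumption based only on the upgrade above.

```python
import re


def parse_java_version(text):
    """Extract (major, minor) from a version string such as '1.8.0_101'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Assumed minimum, taken from the upgrade target in this memo.
REQUIRED = (1, 8)

if __name__ == "__main__":
    version = parse_java_version("1.8.0_101")
    print("new enough" if version >= REQUIRED else "too old")
```

The same function can be fed the text of `java -version` from the shell, as long as the first dotted number in that text is the version.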