Solr, Python, MacBook Air in Shinagawa Seaside


[][][] Trying すもももももももものうち with BASIS TECHNOLOGY's Rosette and Kuromoji (memo)

# -*- coding: utf-8 -*-

# Kuromoji vs. Rosette parsing showdown (run with Jython)

# Kuromoji
from com.atilika.kuromoji.unidic import Tokenizer

# Rosette (BASIS TECHNOLOGY RLP)
from com.basistech.util import Pathnames
from com.basistech.util import LanguageCode
from com.basistech.rlp import EnvironmentParameters
from com.basistech.rlp import RLPEnvironment
from com.basistech.rlp import ContextParameters
from com.basistech.rlp import TokenIteratorResultAccess
from com.basistech.rlp import ResultAccess
from com.basistech.rlp import TokenData

from java.io import File

if __name__ == "__main__":
    parseWord = u"すもももももももものうち"

    # Kuromoji: begin
    tokenizer = Tokenizer()
    tokens = tokenizer.tokenize(parseWord)

    print "\nKuromoji"
    for token in tokens:
        print token.getSurface() + "\t" + token.getAllFeatures()
    # Kuromoji: end

    # Rosette: set up the environment and context parameters
    btRoot = "/hoge/BasisTech"
    Pathnames.setBTRootDirectory(btRoot)
    envParams = EnvironmentParameters()
    environmentPath = btRoot + "/rlp/etc/rlp-environment.xml"
    envParams.setEnvironmentDefinition(File(environmentPath))
    rlpEnv = RLPEnvironment(envParams)
    contextParam = ContextParameters()
    contextPath = btRoot + "/rlp/samples/etc/rlp-bl-context.xml"
    contextParam.setContextDefinition(File(contextPath))
    rlpContext = rlpEnv.getContext(contextParam)
    rlpContext.setProperty("com.basistech.jsonw.skip", "true")

    # Morphological analysis
    rlpContext.process(parseWord, LanguageCode.UNKNOWN)

    # Retrieve the morphological analysis results
    resultAccess = ResultAccess(rlpContext)
    tokenResultAccess = TokenIteratorResultAccess(resultAccess)
    tokenData = TokenData()

    print "\nBASIS TECHNOLOGY Rosette"
    while tokenResultAccess.next(tokenData):
        # Trailing commas keep the text, part of speech and lemma on one line
        print tokenData.getText() + '\t',

        if tokenData.getPartOfSpeech():
            print tokenData.getPartOfSpeech(),

        if tokenData.getLemma():
            print tokenData.getLemma(),

        print  # end the line so that each token gets its own row


Kuromoji
すもも  名詞,普通名詞,一般,*,*,*,スモモ,李,すもも,スモモ,すもも,スモモ,和,*,*,*,*
も      助詞,係助詞,*,*,*,*,モ,も,も,モ,も,モ,和,*,*,*,*
もも    名詞,普通名詞,一般,*,*,*,モモ,桃,もも,モモ,もも,モモ,和,*,*,*,*
も      助詞,係助詞,*,*,*,*,モ,も,も,モ,も,モ,和,*,*,*,*
もも    名詞,普通名詞,一般,*,*,*,モモ,桃,もも,モモ,もも,モモ,和,*,*,*,*
の      助詞,格助詞,*,*,*,*,ノ,の,の,ノ,の,ノ,和,*,*,*,*
うち    名詞,普通名詞,副詞可能,*,*,*,ウチ,内,うち,ウチ,うち,ウチ,和,*,*,*,*

BASIS TECHNOLOGY Rosette
すもも  NC
もも    NC
もも    NC
も      PL
もの    PL
うち    V うつ
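
To make the difference between the two results easier to see, here is a minimal sketch that compares where each tokenizer places its token boundaries. The token lists are copied by hand from the outputs above; the helper name `boundaries` is hypothetical and not part of Kuromoji or Rosette. It needs neither library, so it should run under plain Python as well as Jython.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

SENTENCE = u"すもももももももものうち"

# Surface forms copied from the two outputs above.
KUROMOJI = [u"すもも", u"も", u"もも", u"も", u"もも", u"の", u"うち"]
ROSETTE = [u"すもも", u"もも", u"もも", u"も", u"もの", u"うち"]


def boundaries(tokens):
    """Return the set of character offsets at which a token ends."""
    ends = set()
    pos = 0
    for tok in tokens:
        pos += len(tok)
        ends.add(pos)
    return ends


k_ends = boundaries(KUROMOJI)
r_ends = boundaries(ROSETTE)

print("shared boundaries:", sorted(k_ends & r_ends))
print("Kuromoji only:    ", sorted(k_ends - r_ends))
print("Rosette only:     ", sorted(r_ends - k_ends))
```

Both token lists concatenate back to the same input string, so the comparison is purely about where the cuts fall.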


[] What I did when I got "No results found" (memo)


The default time range is "Last 15 minutes", so if the only data available is more than 15 minutes old, the message below appears and no graph is displayed.

No results found 
Unfortunately I could not find any results matching your search. I tried really hard. I looked all over the place and frankly, I just couldn't find anything good. Help me, help you. Here are some ideas:

Expand your time range
I see you are looking at an index with a date field. It is possible your query does not match anything in the current time range, or that there is no data at all in the currently selected time range. Click the button below to open the time picker. For future reference you can open the time picker by clicking the time picker in the top right corner of your screen.

Refine your query
The search bar at the top uses Elasticsearch's support for Lucene Query String syntax. Let's say we're searching web server logs that have been parsed into a few fields.
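
The behaviour described in this memo comes down to a simple time-window filter. The following is a minimal sketch of that idea, not the dashboard's actual implementation: the function name `in_time_range` and the sample timestamps are made up for illustration.

```python
from __future__ import print_function
from datetime import datetime, timedelta


def in_time_range(timestamps, now, window=timedelta(minutes=15)):
    """Return the timestamps that fall inside [now - window, now]."""
    start = now - window
    return [t for t in timestamps if start <= t <= now]


# Hypothetical data: every document is older than 15 minutes.
now = datetime(2016, 8, 1, 12, 0, 0)
docs = [now - timedelta(minutes=40), now - timedelta(hours=2)]

recent = in_time_range(docs, now)                      # default 15-minute window
wider = in_time_range(docs, now, timedelta(hours=4))   # expanded time range

print("default window:", len(recent), "hit(s)")
print("4-hour window: ", len(wider), "hit(s)")
```

With the default window nothing matches, which corresponds to the "No results found" case; widening the window brings the older documents back.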



[] Setting up a natural language processing environment (memo)

$ sudo yum -y install java-1.8.0-openjdk-devel
$ sudo alternatives --config java <- select 1.8

Install Jython
$ wget -O /tmp/jython-installer-2.7.0.jar "http://search.maven.org/remotecontent?filepath=org/python/jython-installer/2.7.0/jython-installer-2.7.0.jar"
$ sudo su -
#  java -jar /tmp/jython-installer-2.7.0.jar -c
Welcome to Jython !
You are about to install Jython version 2.7.0
(at any time, answer c to cancel the installation)
For the installation process, the following languages are available: English, German
Please select your language [E/g] >>>
Do you want to read the license agreement now ? [y/N] >>>
Do you accept the license agreement ? [Y/n] >>>
The following installation types are available:
  1. All (everything, including sources)
  2. Standard (core, library modules, demos and examples, documentation)
  3. Minimum (core)
  9. Standalone (a single, executable .jar)
Please select the installation type [ 1 /2/3/9] >>> 2
Do you want to install additional parts ? [y/N] >>>
Do you want to exclude parts from the installation ? [y/N] >>>
Please enter the target directory >>> /usr/local/jython
Unable to find directory /usr/local/jython, create it ? [Y/n] >>>
Your java version to start Jython is: Oracle Corporation / 1.8.0_101
Your operating system version is: Linux / 4.4.11-23.53.amzn1.x86_64
  - mod: true
  - demo: true
  - doc: true
  - src: false
  - ensurepip: true
  - JRE: /usr/lib/jvm/java-1.8.0-openjdk-
Please confirm copying of files to directory /usr/local/jython [Y/n] >>>
 10 %
 20 %
 30 %
 40 %
 50 %
 60 %
 70 %
Generating start scripts ...
Installing pip and setuptools
 90 %
Ignoring indexes: https://pypi.python.org/simple/
Downloading/unpacking setuptools
Downloading/unpacking pip
Installing collected packages: setuptools, pip
Successfully installed setuptools pip
Cleaning up...
 100 %
Do you want to show the contents of README ? [y/N] >>>
Congratulations! You successfully installed Jython 2.7.0 to directory /usr/local/jython.

$ sudo chmod -R 777 /usr/local/jython/cachedir

$ sudo yum -y install gcc-c++ glibc-headers openssl-devel readline libyaml-devel readline-devel zlib zlib-devel libffi-devel libxml2 libxslt libxml2-devel libxslt-devel sqlite-devel

$ wget http://ohse.de/uwe/releases/lrzsz-0.12.20.tar.gz
$ tar -xzvf lrzsz-0.12.20.tar.gz
$ cd lrzsz-0.12.20
$ ./configure --prefix=/usr/local
$ make
$ sudo make install
$ cd /usr/local/bin
$ sudo ln -s lrz rz
$ sudo ln -s lsz sz

$ sudo pip install elasticsearch
$ python
>>> from elasticsearch import Elasticsearch

$ sudo /usr/local/jython/bin/pip install elasticsearch
$ jython
>>> from elasticsearch import Elasticsearch

$ wget http://download.icu-project.org/files/icu4j/55.1/icu4j-55_1.jar
$ vi ~/.bash_profile
    export CLASSPATH=$CLASSPATH:/hoge/icu4j-55_1.jar
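
A typo in `CLASSPATH` tends to surface only later, as a failed Java import inside Jython, so a quick check can save time. The sketch below is a hypothetical helper (not part of Jython or ICU4J) that lists `CLASSPATH` entries that do not exist on disk.

```python
from __future__ import print_function
import os


def missing_classpath_entries(classpath, sep=os.pathsep):
    """Return the entries of a CLASSPATH-style string that do not exist.

    Empty entries are skipped. Wildcard entries such as "lib/*" are not
    expanded here, so they would be reported as missing.
    """
    return [p for p in classpath.split(sep) if p and not os.path.exists(p)]


if __name__ == "__main__":
    for entry in missing_classpath_entries(os.environ.get("CLASSPATH", "")):
        print("not found:", entry)
```

Run it after `source ~/.bash_profile`; no output means every entry was found.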

$ sudo wget http://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo -O /etc/yum.repos.d/epel-apache-maven.repo
$ sudo sed -i s/\$releasever/6/g /etc/yum.repos.d/epel-apache-maven.repo
$ sudo yum install -y apache-maven
$ mvn --version

$ wget http://mecab.googlecode.com/files/mecab-0.996.tar.gz
$ tar zvxf mecab-0.996.tar.gz
$ cd mecab-0.996
$ ./configure
$ make
$ sudo make install

$ sudo su -
# pip install https://mecab.googlecode.com/files/mecab-python-0.996.tar.gz
$ python
>>> import MeCab
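
The interactive import checks used so far (`elasticsearch`, `MeCab`) can also be scripted, which is handy when the same check has to be repeated under both `python` and `jython`. A minimal sketch; `can_import` is a hypothetical helper name, and the module list simply mirrors the packages installed in this memo.

```python
from __future__ import print_function
import importlib


def can_import(name):
    """Return True if the named module can be imported by this interpreter."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


if __name__ == "__main__":
    for module in ("elasticsearch", "MeCab"):
        print(module, "OK" if can_import(module) else "MISSING")
```

Saved to a file, the same script can be run once per interpreter to compare what each one sees.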

$ wget https://osdn.jp/projects/unidic/downloads/58338/unidic-mecab-2.1.2_src.zip
$ unzip unidic-mecab-2.1.2_src.zip
$ cd unidic-mecab-2.1.2_src
$ ./configure
$ make
$ sudo make install
$ sudo ldconfig
$ sudo vi /usr/local/etc/mecabrc 
    ; dicdir =  /usr/local/lib/mecab/dic/ipadic
    dicdir =  /usr/local/lib/mecab/dic/unidic
$ mecab
すもも  スモモ  スモモ  李      名詞-普通名詞-一般
も      モ      モ      も      助詞-係助詞
もも    モモ    モモ    桃      名詞-普通名詞-一般
も      モ      モ      も      助詞-係助詞
もも    モモ    モモ    桃      名詞-普通名詞-一般
の      ノ      ノ      の      助詞-格助詞
うち    ウチ    ウチ    内      名詞-普通名詞-副詞可能
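
For further processing, the tab-separated `mecab` output can be split into fields. The following is a minimal sketch under two assumptions: the column names (surface, pronunciation, lemma reading, lemma, part of speech) are my reading of the rows above rather than something stated in this memo, and the sample text uses only three of those rows plus the `EOS` line that `mecab` prints at the end of a sentence. `parse_mecab_output` is a hypothetical helper, not part of the MeCab bindings.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
from collections import namedtuple

# Column names are an assumption based on the rows shown above.
Token = namedtuple("Token", ["surface", "pron", "lemma_reading", "lemma", "pos"])

SAMPLE = (
    u"も\tモ\tモ\tも\t助詞-係助詞\n"
    u"の\tノ\tノ\tの\t助詞-格助詞\n"
    u"うち\tウチ\tウチ\t内\t名詞-普通名詞-副詞可能\n"
    u"EOS\n"
)


def parse_mecab_output(text):
    """Split tab-separated mecab output into Token tuples, skipping EOS lines."""
    tokens = []
    for line in text.splitlines():
        if not line or line == u"EOS":
            continue
        # Pad short rows so that every Token has exactly five fields.
        fields = (line.split(u"\t") + [u""] * 5)[:5]
        tokens.append(Token(*fields))
    return tokens


tokens = parse_mecab_output(SAMPLE)
for tok in tokens:
    print(tok.surface, tok.lemma, tok.pos)
```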

$ sudo yum install git
$ cd ~/work
$ git clone https://github.com/atilika/kuromoji.git
$ cd kuromoji
$ mvn -pl kuromoji-unidic -am package <- fails if there is too little memory
$ vi ~/.bash_profile


[][] version upgrade (memo)

The pip bundled with Jython failed with an error, so I upgraded the OpenJDK version.

$ sudo yum -y install java-1.8.0-openjdk-devel
$ sudo alternatives --config java <- select 1.8
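
Before retrying pip, it can be worth confirming that the switch took effect. A minimal sketch of a version comparison: `parse_java_version` and `is_at_least` are hypothetical helpers, "1.8.0_101" is the version string reported by the Jython installer in the setup memo, and "1.7.0_79" is a made-up example of an older version.

```python
from __future__ import print_function
import re


def parse_java_version(version):
    """Turn a string such as '1.8.0_101' into a tuple of ints: (1, 8, 0, 101)."""
    return tuple(int(n) for n in re.findall(r"\d+", version))


def is_at_least(version, minimum):
    """Return True if `version` is the same as or newer than `minimum`."""
    return parse_java_version(version) >= parse_java_version(minimum)


print(is_at_least("1.8.0_101", "1.8"))  # version seen in the setup memo
print(is_at_least("1.7.0_79", "1.8"))   # hypothetical older version
```

Under Jython, the running JVM's version string should be available from `java.lang.System.getProperty("java.version")` and can be passed to `is_at_least` directly.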