R Conqueror: Generating a wordcloud from a Speech, Part I

 

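## Analyze the speech stored in hong.txt and generate a word cloud from the words it mentions

## 1. Read the data to analyze

## 2. Remove unnecessary text

## 3. Add necessary words

## 4. Save to a file

## 5. Convert to table format and read into a variable

## 6. Draw the wordcloud

## 7. Save the image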



getwd() # check working directory

setwd("C:\\Users\\user\\Desktop\\R까기") ## change working directory



# 0. packages load



library(KoNLP)

library(wordcloud)

library(RColorBrewer)



# 1. read data



 txt = readLines("data/Part_1/LEVEL_1/hong.txt") ## read all text lines

 head(txt)



# 2. edit data (delete & add)



         txt = gsub("7","",txt) # remove every "7" character from the text
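
As a small illustration of the cleanup step, the sketch below applies the same `gsub()` call to a made-up sample vector (`sample_txt` is hypothetical and not taken from hong.txt). `gsub()` replaces every match in every element, so the digit disappears wherever it occurs, including inside longer numbers.

```r
# Hypothetical sample lines; the real script works on the contents of hong.txt
sample_txt <- c("7 peace and unity", "page 17 of the speech", "no digits here")

# Replace every "7" with the empty string, element by element
cleaned <- gsub("7", "", sample_txt)
print(cleaned)
```

Note that "17" becomes "1" here, so a pattern this broad is probably only safe when the digit is known to be stray debris.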

  

# 3. extract nouns

         nouns = sapply(txt, extractNoun, USE.NAMES=FALSE) # extract the nouns from each line

         nouns_unlist = unlist(nouns) # flatten into a single character vector

         head(nouns_unlist,30)
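
The `sapply()` plus `unlist()` pattern above can be tried without KoNLP. The sketch below is only an approximation: `split_words` is a hypothetical stand-in that splits on spaces, whereas `extractNoun` performs real Korean noun extraction. It shows how a per-line list of words is flattened into one vector.

```r
# Stand-in tokenizer (assumption): splits on spaces instead of calling KoNLP's extractNoun
split_words <- function(line) strsplit(line, " ", fixed = TRUE)[[1]]

# Hypothetical sample lines with different numbers of words
lines <- c("peace unity", "nation", "peace nation future")

# One character vector per input line; the lengths differ, so sapply() returns a list
per_line <- sapply(lines, split_words, USE.NAMES = FALSE)

# Flatten the list into a single character vector of words
flat <- unlist(per_line)

print(per_line)
print(flat)
```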



# 3-1. keep only words of two or more characters

        txt_2 = Filter(function(x) {
                nchar(x) >= 2
        }, nouns_unlist)

        txt_2

        head(txt_2, 30)
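
The length filter can be checked on a small vector. This is a minimal sketch with made-up sample words (`words` is hypothetical). It also shows an equivalent vectorized form using logical indexing. By default `nchar()` counts characters rather than bytes, so a single Hangul syllable counts as one character.

```r
# Hypothetical sample words of varying length
words <- c("a", "of", "people", "I", "nation")

# Same approach as the script: keep elements for which the predicate is TRUE
kept_filter <- Filter(function(x) nchar(x) >= 2, words)

# Equivalent vectorized form using logical indexing
kept_index <- words[nchar(words) >= 2]

print(kept_filter)
stopifnot(identical(kept_filter, kept_index))
```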





# 4. save data

        write(txt_2, "hong_2.txt")





# 5. read data as table

        rev = read.table("hong_2.txt")

        rev

        nrow(rev) # check the number of rows in the data
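
The save and reload round trip in steps 4 and 5 can be sketched on a temporary file. The sample words and the temp file are assumptions for illustration only. `write()` puts each element of a character vector on its own line, and `read.table()` then returns a data frame with one row per word in a column named `V1`.

```r
# Hypothetical sample words; the script writes txt_2 to hong_2.txt instead
words <- c("peace", "nation", "peace")
tmp <- tempfile(fileext = ".txt")

# For character vectors, write() uses one column, i.e. one word per line
write(words, tmp)

# Read the file back as a one-column data frame
df <- read.table(tmp, stringsAsFactors = FALSE)

print(df)
print(nrow(df))
```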

      

# 6. wordcloud

        table(rev)

        wordcount = table(rev)

        head(sort(wordcount, decreasing=TRUE),30) # check the most frequently occurring words
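
The counting step can be illustrated with a small made-up vector (`words` is hypothetical). `table()` counts how often each distinct word occurs, and `sort(..., decreasing = TRUE)` puts the most frequent words first.

```r
# Hypothetical sample words
words <- c("peace", "nation", "peace", "future", "peace", "nation")

# Frequency of each distinct word
counts <- table(words)

# Most frequent words first
top <- sort(counts, decreasing = TRUE)

print(head(top, 2))
```

The names of the table are the words and its values are the frequencies, which is the pairing that `wordcloud()` expects for its words and `freq` arguments.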





# 7. graphic output

        windows() # open a separate graphics window; without calling windows() first, savePlot() raises an error

        palete = brewer.pal(9,"Set1")

        wordcloud(names(wordcount), freq=wordcount, scale=c(5,.5), rot.per= .25, min.freq=1, random.order=FALSE, random.color = TRUE, colors=palete)
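
The outline lists a "save the image" step, but the script does not show the call. The sketch below is one possible way to do it, not the author's method: instead of `savePlot()`, which depends on the `windows()` device, it draws into a PNG file through the `png()` graphics device. The helper `save_png`, the output path, and the placeholder plot are all assumptions for illustration.

```r
# Hypothetical helper: draw a plot into a PNG file via the png() graphics device
save_png <- function(draw, file, width = 800, height = 800) {
  png(file, width = width, height = height)  # open the file device
  on.exit(dev.off())                         # always close it, even on error
  draw()                                     # run the plotting code
  invisible(file)
}

# Hypothetical output path in the session's temporary directory
out_file <- file.path(tempdir(), "wordcloud_demo.png")

# Only attempt the save if this R build supports PNG output
if (capabilities("png")) {
  # plot(1:10) is a placeholder; in the script this would be the wordcloud(...) call
  save_png(function() plot(1:10), out_file)
}
```

To save the actual word cloud, the placeholder would be replaced by a function that wraps the `wordcloud(...)` call from step 7.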
Result: [attached image of the generated word cloud]

Wednesday, November 19, 2014
