R takes forever to compute a simple procedure
allWords is a vector of 1.3 million words, with some repetition. What I
want to do is create two vectors:
A with each distinct word
B with the number of occurrences of that word
so that I can later join them in a matrix and thus associate them, like:
"mom", 3 ; "pencil", 14 etc.
A <- character(0)
B <- integer(0)
for (word in unique(allWords)) {
    # get a vector with the indexes of all repetitions of the word
    temp <- which(allWords == word)
    # make "allWords" smaller: drop the entries that were just counted
    allWords <- allWords[-temp]
    # calculate the number of occurrences
    occ <- length(temp)
    # store
    A <- c(A, word)
    B <- c(B, occ)
}
This for loop takes forever and I don't really know why or what I am doing
wrong. Reading the 1.3 million words from a file takes only about 5
seconds, but with these basic operations the loop never seems to
finish.
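A likely reason for the slowness, offered as a sketch rather than a definitive diagnosis: every pass of the loop rescans the whole vector with which(), and c() copies A and B each time they grow, so the work scales roughly with the number of distinct words times the length of the vector. Base R's table() does the same tally in a single call. The short sample vector and the name wordCounts below are made up for illustration; only allWords, A and B come from the question.

```r
# Hypothetical small sample standing in for the 1.3 million-word vector.
allWords <- c("mom", "pencil", "mom", "dog", "pencil", "mom")

# table() tallies every distinct value in a single call,
# with no explicit loop and no vectors grown element by element.
counts <- table(allWords)

A <- names(counts)      # the distinct words
B <- as.vector(counts)  # the number of occurrences of each word

# A data frame keeps the words as character and the counts as integer;
# a matrix would coerce both columns to character.
wordCounts <- data.frame(word = A, count = B, stringsAsFactors = FALSE)
```

Looking up a single word is then `wordCounts$count[wordCounts$word == "mom"]`. A data frame is used here instead of a matrix only so the counts stay numeric; cbind(A, B) would also associate the two vectors, at the cost of turning the counts into strings.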