Getting wordlist for each separate file in the Process Documents operator

b00122599 Member Posts: 26 Contributor II
edited June 2020 in Help
Hey folks,

I am trying to get the top words from text files using the process in the XML below. However, I want the top words for each text file in a folder separately, and the process below gives me results for the whole collection of text files. Is there any way to get it to process the files individually rather than as a group?

Thanks in advance,

Neil.

<portSpacing port="sink_result 2" spacing="0"/>

Answers

  • jmphillips Member Posts: 18 Contributor II
    Hello, this could help you.

    ;)
    Rapid.rmp 6.9K
  • b00122599 Member Posts: 26 Contributor II
    Hello,

    Sorry for the delay in my reply, and thanks for the help. I am now looping through the text files successfully with the process above, but the wordlist is empty for every file in my output list.

    Thanks again,

    Neil.
  • kayman Member Posts: 662 Unicorn
    At first glance nothing seems wrong with your process logic, so what if you tune the parameters a bit (more precisely, the pruning ones)?
    Or better, first try without any pruning and bypass Filter Tokens etc. to make sure you don't lose your content in these steps. If you do, at least you know why you get no results.

    Also, just as a side note: given that you are only looking for your wordlist, you can untick 'vector creation' in your Process Documents operator. You don't need it, so it will speed things up a bit.
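To illustrate why pruning can leave an empty wordlist once each file is processed on its own, here is a minimal Python sketch. It is not RapidMiner code: the `wordlist` function and its `prune_above` / `prune_below` thresholds are hypothetical, and it assumes pruning is based on the fraction of documents that contain a word. Under that assumption, every word in a one-document collection has a document fraction of 1.0, so an upper threshold such as 0.9 removes all of them.

```python
from collections import Counter
import re


def wordlist(docs, prune_above=None, prune_below=None):
    """Return {word: document count}, pruned by document-frequency fraction.

    Hypothetical helper: thresholds are fractions in [0, 1] of the
    documents in `docs` that contain the word.
    """
    n = len(docs)
    doc_freq = Counter()
    for text in docs:
        # Count each word once per document.
        doc_freq.update(set(re.findall(r"[a-z]+", text.lower())))

    kept = {}
    for word, df in doc_freq.items():
        frac = df / n
        if prune_above is not None and frac > prune_above:
            continue  # too common across documents
        if prune_below is not None and frac < prune_below:
            continue  # too rare across documents
        kept[word] = df
    return kept


docs = ["the cat sat", "the dog ran", "the bird flew"]

# Whole collection: only "the" (present in 3 of 3 documents) is pruned.
whole = wordlist(docs, prune_above=0.9)

# One document at a time: every word is in 1 of 1 documents, so all are pruned.
single = wordlist(docs[:1], prune_above=0.9)
```

With pruning switched off (`prune_above=None`), the single-document call keeps all of its words, which matches the suggestion to test without pruning first.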
  • b00122599 Member Posts: 26 Contributor II
    Thanks very much, that worked. Now a new problem: a lot of my results are just named "Example Set", so I can't tell which results belong to which text file. Would you happen to have any pointers on how to add the text file name to the output instead of "Example Set"? Thanks again.

    Neil.
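For comparison, the idea of keeping each wordlist tied to its source file can be sketched outside RapidMiner in a few lines of Python. This is only an illustration of the goal, not the process discussed in this thread: the `top_words_per_file` function, the `*.txt` pattern and the sample files are all assumptions made for the example. Each file is counted on its own and the result is stored under the file name.

```python
from collections import Counter
from pathlib import Path
import re
import tempfile


def top_words_per_file(folder, n=10, pattern="*.txt"):
    """Return {file name: [(word, count), ...]} with one wordlist per file."""
    results = {}
    for path in sorted(Path(folder).glob(pattern)):
        text = path.read_text(encoding="utf-8").lower()
        tokens = re.findall(r"[a-z]+", text)
        # Key the result by file name so each wordlist stays identifiable.
        results[path.name] = Counter(tokens).most_common(n)
    return results


# Small demo with two made-up files in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "a.txt").write_text("apple apple pear", encoding="utf-8")
    Path(folder, "b.txt").write_text("kiwi kiwi kiwi fig", encoding="utf-8")
    results = top_words_per_file(folder)

for name, words in results.items():
    print(name, words)
```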
  • b00122599 Member Posts: 26 Contributor II
    Thanks again, all help is appreciated!