IngoRM | Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor | Posts: 1,751 | RM Founder
Hi,
Do you mean just getting rid of the symbols "@" and "#", or do you also want to remove what follows them, e.g. so that "@ingomierswa" and "#datascience" are removed completely?
Both are easily possible with the "Replace" operator and a simple regular expression. Below is a small sample process showing you how this is done.
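The sample process itself is not reproduced in this thread, so here is a minimal sketch of the two options in Python, with the `re` module standing in for the "Replace" operator. The function names and both patterns are assumptions chosen for illustration, not the expressions used in the original process.

```python
import re


def remove_symbols(text):
    """Drop only the '@' and '#' characters, keeping the words after them."""
    return re.sub(r"[@#]", "", text)


def remove_tokens(text):
    """Drop '@' or '#' together with the word characters that follow."""
    return re.sub(r"[@#]\w+", "", text)


sample = "Thanks @ingomierswa for the #datascience tips"
print(remove_symbols(sample))  # Thanks ingomierswa for the datascience tips
print(remove_tokens(sample))   # Thanks  for the  tips (double spaces remain)
```

Replacing with the empty string leaves the surrounding whitespace untouched, so a second pass that collapses repeated spaces may be useful.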
Hi @IngoRM. This worked, thank you, but I'm left with characters other than letters. So this clears up letters after the # but not other characters. For example, I had @g_smug and it only removed @g and stopped at the underscore. Any suggestions?
Answers
Hope this helps,
Ingo
Thanks
Extend your regex a bit, like this:
\b(@|#)[^\. \s, ]+
It looks a bit ugly, but it basically means: find any 'word' that starts with either @ or #, and select everything up to the next space, dot, or comma. Replace this with nothing and it's gone.
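To see how that expression behaves, here is a small sketch in Python (RapidMiner uses Java regular expressions, but these constructs behave the same way in both). One thing worth checking on your own data: the leading `\b` only matches when the @ or # directly follows a letter, digit, or underscore, so a tag preceded by a space is left alone. The `RELAXED` pattern and the `strip_tags` helper are hypothetical variants added for comparison and are not part of the answer above.

```python
import re

# Pattern as posted in the thread (note the leading \b).
POSTED = re.compile(r"\b(@|#)[^\. \s, ]+")

# Hypothetical variant without \b, so tags that follow a space or start
# the text are matched as well.
RELAXED = re.compile(r"[@#][^.\s,]+")


def strip_tags(text, pattern=RELAXED):
    """Replace every match of the pattern with the empty string."""
    return pattern.sub("", text)


glued = "I had@g_smug, see#datascience."
spaced = "I had @g_smug, see #datascience."

print(strip_tags(glued, POSTED))    # I had, see.
print(strip_tags(spaced, POSTED))   # I had @g_smug, see #datascience.
print(strip_tags(spaced, RELAXED))  # I had , see .
```

Because the character class excludes only dots, whitespace, and commas, underscores and other symbols inside a handle such as `@g_smug` are consumed along with the letters.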