"how to develop a new algorithm in RapidMiner?"
I have an idea for a new algorithm, and I want to develop and test it in RapidMiner. Should I use the extension template provided by RapidMiner, or is there another way?
Best Answers
rfuentealba (Moderator, RapidMiner Certified Analyst, Member, University Professor; Posts: 568; Unicorn)
Hello @Obaeissa, welcome to the community.
The RapidMiner extension template is provided so that you don't have to wire up the connection to RapidMiner and import everything from it yourself. If you are proficient in Java, it is the most recommended way to implement your algorithm. You can also use the Apache Groovy programming language to implement it and run it through the "Execute Script" operator. However, I haven't seen much documentation about this (perhaps my good friends @mschmitz, @David_A and @land can give you some more tricks. Perhaps @IngoRM too).
If your idea of an algorithm is something you are trying for the first time, I would recommend creating a Python (or whatever language you feel comfortable with) implementation first, and then building a RapidMiner operator (or superoperator) based on that. At least that is what I did when I "invented" the Naïve Bayes algorithm (Yes, I did it almost 200 years after Sir Thomas Bayes, but I didn't know it until I saw my first data science books, so... sorry). If you go this route, make sure you use the Anaconda Python distribution and the Python Scripting extension, so that it is easier to test it through RapidMiner.
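To illustrate the "prototype it in plain Python first" advice, here is a minimal sketch of a categorical Naïve Bayes classifier using only the standard library. It is not anyone's actual implementation from this thread: the class name `TinyNaiveBayes`, the toy weather data and the choice of Laplace (add-one) smoothing are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Categorical Naive Bayes with add-one smoothing (hypothetical prototype)."""

    def fit(self, rows, labels):
        # rows: list of tuples of categorical values; labels: one class per row
        self.class_counts = Counter(labels)
        n_features = len(rows[0])
        # value_counts[(label, feature_index)][value] -> number of occurrences
        self.value_counts = defaultdict(Counter)
        # distinct values seen per feature, needed for the smoothing denominator
        self.vocab = [set() for _ in range(n_features)]
        for row, label in zip(rows, labels):
            for j, value in enumerate(row):
                self.value_counts[(label, j)][value] += 1
                self.vocab[j].add(value)
        return self

    def predict(self, row):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(count / total)
            for j, value in enumerate(row):
                seen = self.value_counts[(label, j)][value]
                score += math.log((seen + 1) / (count + len(self.vocab[j])))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data, invented for this example
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]

model = TinyNaiveBayes().fit(rows, labels)
print(model.predict(("rainy", "cool")))  # -> yes
print(model.predict(("sunny", "hot")))   # -> no
```

Once a prototype like this behaves as expected on small hand-checked inputs, it is much easier to decide what the inputs, outputs and parameters of the eventual RapidMiner operator should be.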
BTW, write a paper about your algorithm. It is important to keep things as scientific as possible, not because it is a RapidMiner requirement but because data scientists like academic processes. Yes, you will hear @yyhuang saying that "a lot of academic data scientists haven't seen problems in real life", but creating an algorithm (rather than making use of it) is a totally different matter.
Hope this helps,
Rodrigo.
IngoRM (Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor; Posts: 1,751; RM Founder)
To add to Rodrigo's comment: I would definitely recommend always working with the language you are already comfortable with. If you know Java, there is simply no point in learning Python first; going straight to building a Java extension is most likely the simplest way for you. But if you already know R or Python, or even have an implementation there already, the first step should always be to integrate that. Just like Rodrigo has said.
So let's assume you in fact do know Java and want to go down the extension route. Then please use this documentation here:
If you are familiar with Java, Git, Gradle and your favorite IDE (IntelliJ, Eclipse) already, you should be able to be up and running in less than an hour...
On the freelancing: while I would certainly be able to code this for you, I have some doubts that you would be willing to pay my daily rate for that, so I hope that somebody else will step in here to help out in case you need it.
Hope this helps,
Ingo
Answers
The first step is to create a process and use the Python Scripting extension to solve your problem.
When that process is working and you have the inputs, outputs and parameters right, you can use the Custom Operators extension to transform the process into an operator.
Custom Operators: https://marketplace.www.kenlockard.com/UpdateServer/faces/product_details.xhtml?productId=rmx_process_defined_operators
Tutorial: https://community.www.kenlockard.com/discussion/56872/tutorial-for-creating-custom-operators
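For the first step (a working process built on the Python Scripting extension), the script inside the Execute Python operator could look roughly like the sketch below. To my knowledge the operator calls a function named `rm_main`, passes one pandas DataFrame per connected input port, and maps the returned values to its output ports. Everything else here is an assumption for illustration: the column names, the two module-level "parameters", and the nearest-centroid classifier, which is only a placeholder for your own algorithm and expects numeric attributes.

```python
import pandas as pd

# Hypothetical settings; in a custom operator these would become parameters.
LABEL_COLUMN = "label"
PREDICTION_COLUMN = "prediction"


def rm_main(data):
    """One argument per input port; each returned value goes to an output port."""
    features = [c for c in data.columns if c != LABEL_COLUMN]

    # Placeholder algorithm: the mean of each class is its centroid
    centroids = data.groupby(LABEL_COLUMN)[features].mean()

    # Squared Euclidean distance from every row to every class centroid
    distances = pd.DataFrame(
        {
            label: ((data[features] - centroid) ** 2).sum(axis=1)
            for label, centroid in centroids.iterrows()
        }
    )

    scored = data.copy()
    # The predicted class is the one with the closest centroid
    scored[PREDICTION_COLUMN] = distances.idxmin(axis=1)

    # Two outputs: the scored examples and the learned centroids
    return scored, centroids.reset_index()


# Local check outside RapidMiner, using invented toy data
toy = pd.DataFrame(
    {
        "x": [0.0, 0.2, 5.0, 5.2],
        "y": [0.0, 0.1, 5.0, 5.1],
        "label": ["a", "a", "b", "b"],
    }
)
scored, centroids = rm_main(toy)
print(scored)
print(centroids)
```

Calling `rm_main` on a small DataFrame like this before pasting the script into the operator makes it easier to settle the inputs, outputs and parameters before wrapping the process as a custom operator.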
After building the custom operator (one or many), you create the custom extension. It will be a normal RapidMiner extension (in your case depending on the Python Scripting extension), and you can put it on Server, give it to other people and even publish it on the Marketplace if it is helpful for others.
Regards,
Balázs