"Series by examples, differentiate, break by ID?"

mafern76 Member Posts: 45 Contributor II
edited June 2019 in Help
Hi! I have the following data...

id time att0
1 1 5
2 1 8
2 2 9
3 2 5
3 4 4
3 5 2
4 5 6
4 6 5
4 8 8
4 10 5

Different IDs have different numbers of recorded instances: some have one, some two, some maybe 10.

How can I get from that data to this:

id time att0 diff_att0
1 1 5 0
2 1 8 0
2 2 9 1
3 2 5 0
3 4 4 -1
3 5 2 -2
4 5 6 0
4 6 5 -1
4 8 8 3
4 10 5 -3

Using the DIFFERENTIATE node in the Series Extension it is possible to do so, but only while disregarding the ID. Is there a way to break the DIFFERENTIATE by ID? Or another way to achieve the same result?

Thanks a lot!!

Best regards.

PS: the final idea is to then also aggregate diff_att0 to get further information on how att0 moved through time.
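For readers who want to check the target table outside RapidMiner, here is a minimal Python sketch of the requested transformation. It is not a RapidMiner process; the function name `diff_by_id` is made up for illustration, and it assumes the rows are already sorted by time within each id and that the first row of every id should get a difference of 0, as in the table above.

```python
# Rows are (id, time, att0), taken from the sample data in the question.
rows = [
    (1, 1, 5),
    (2, 1, 8), (2, 2, 9),
    (3, 2, 5), (3, 4, 4), (3, 5, 2),
    (4, 5, 6), (4, 6, 5), (4, 8, 8), (4, 10, 5),
]


def diff_by_id(rows):
    """Append att0 minus the previous att0 of the same id (0 for an id's first row)."""
    last_seen = {}  # id -> most recent att0 seen for that id
    result = []
    for id_, time, att0 in rows:
        # The first row of an id is compared against itself, which yields 0.
        previous = last_seen.get(id_, att0)
        result.append((id_, time, att0, att0 - previous))
        last_seen[id_] = att0
    return result


with_diff = diff_by_id(rows)
for row in with_diff:
    print(*row)
```

The difference never crosses an id boundary because the previous value is looked up per id rather than taken from the preceding row.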








Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,388 RM Data Scientist
    Hi

    I think you need to pivot the table first. Then you get something like

    id att0_time1 att0_time2....
    1 5 ?
    2 1 2
    and you can easily work on this.


    Cheers,

    Martin
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
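As a rough illustration of the pivot idea (one row per id, one column per time value), here is a small Python sketch. It is not the RapidMiner Pivot operator; the `pivot` helper is hypothetical, the `att0_time<N>` column names follow the naming in the post above, and `None` stands in for the missing value that RapidMiner shows as `?`.

```python
# Rows are (id, time, att0); a subset of the sample data from the question.
rows = [
    (1, 1, 5),
    (2, 1, 8), (2, 2, 9),
    (3, 2, 5), (3, 4, 4), (3, 5, 2),
]


def pivot(rows):
    """Return (column_names, table) with one row per id and one column per time."""
    times = sorted({time for _, time, _ in rows})
    columns = ["id"] + [f"att0_time{t}" for t in times]
    by_id = {}
    for id_, time, att0 in rows:
        by_id.setdefault(id_, {})[time] = att0
    # Times that an id never recorded stay None (missing).
    table = [
        [id_] + [values.get(t) for t in times]
        for id_, values in sorted(by_id.items())
    ]
    return columns, table


columns, table = pivot(rows)
print(columns)
for line in table:
    print(line)
```

With irregular time stamps the wide table becomes sparse, since every distinct time value gets its own column.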
  • mafern76 Member Posts: 45 Contributor II
    Hi Martin, thank you very much for your answer. I thought about working horizontally as well; it allows more detail regarding series progression.

    I haven't yet discovered, though, how I can automatically process any number of attributes using macros.

    For example, if I generate (att0_time1 - att0_time2), is there a way to write %{macro}_time1 - %{macro}_time2 so that it is generated for all attributes?

    Thanks a lot.

    PS: more about my data: it consists of various performance measures taken at different times, so some cases have 1 measurement, some 2, some 3, and so on, and not at regular intervals.

    So the idea would be to be able to get a sense of performance progression and not just a general avg/min/max/sd aggregation...
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,388 RM Data Scientist
    Hi Mafern.

    You might have a look at Generate Function Set. If you have a table like

    att_time1 att_time2 ...

    it generates for example the difference/sum/product between all of them. Might be what you want.

    If you just want att_timeX - att_timeX+1, you might need to work with a loop (a plain Loop, Loop Values, or Loop Attributes).

    Cheers,
    Martin
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
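To make the "att_timeX - att_timeX+1 in a loop" idea concrete, here is a hedged Python sketch that walks over neighbouring columns of a wide table. The column names and sample values are invented for illustration, the helper `add_consecutive_differences` is hypothetical, and a difference is left as `None` whenever either side is missing.

```python
# Hypothetical wide table: one dict per id, one key per time-indexed attribute.
columns = ["att_time1", "att_time2", "att_time3"]
table = [
    {"att_time1": 5, "att_time2": None, "att_time3": None},
    {"att_time1": 8, "att_time2": 9, "att_time3": None},
    {"att_time1": 5, "att_time2": 4, "att_time3": 2},
]


def add_consecutive_differences(table, columns):
    """Return new rows with an extra 'left-right' value for each neighbouring column pair."""
    result = []
    for row in table:
        new_row = dict(row)  # leave the input rows untouched
        for left, right in zip(columns, columns[1:]):
            a, b = row[left], row[right]
            new_row[f"{left}-{right}"] = None if a is None or b is None else a - b
        result.append(new_row)
    return result


wide = add_consecutive_differences(table, columns)
for row in wide:
    print(row)
```

Unlike Generate Function Set, which combines all pairs of attributes, this only touches adjacent columns.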
  • mafern76 Member Posts: 45 Contributor II
    Thank you Martin!

    It's not Generate Function Set I need, just precise calculations like you said, with a loop.

    I'm using generate %{loop_attribute} - %{loop_attribute}_1 for example, this works, but my time intervals are arbitrary, so I need to generate a time instance index. Any idea on how to get it? I couldn't work it out on my own, thanks a lot.

    id time time_instance
    1 4 1
    1 8 2
    1 20 3
    2 3 1
    2 4 2
    2 80 3
    2 120 4
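The time_instance column above is a running counter that restarts at 1 for every id. A minimal Python sketch of that numbering, assuming rows of (id, time) and ordering by time within each id, could look like this (the function name `add_time_instance` is made up for illustration):

```python
from itertools import groupby
from operator import itemgetter

# Rows are (id, time), taken from the example table above.
rows = [(1, 4), (1, 8), (1, 20), (2, 3), (2, 4), (2, 80), (2, 120)]


def add_time_instance(rows):
    """Number the rows of each id 1, 2, 3, ... in order of increasing time."""
    ordered = sorted(rows)  # sorts by id first, then by time
    result = []
    for id_, group in groupby(ordered, key=itemgetter(0)):
        for index, (_, time) in enumerate(group, start=1):
            result.append((id_, time, index))
    return result


indexed = add_time_instance(rows)
for row in indexed:
    print(*row)
```

Because the index depends only on the order of the time values, arbitrary gaps between measurements do not matter.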




  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager
    Hi...have you tried looping by value (ID) and then setting the data with the filtered set? I do this all the time. It's less efficient than pivoting but sometimes cleaner.

    Scott
  • mafern76 Member Posts: 45 Contributor II
    Thank you sgenzer, I made it work this way:

    <operator activated="true" class="filter_examples" compatibility="5.3.015" expanded="true" height="76" name="Filter Examples" width="90" x="112" y="30">

    Then I simply append the collection result.

    So now I am able to get my data horizontally, but I'm hitting a bug when using Loop Attributes to get differences and ratios across timestamps.

    I posted the issue at problems and support:

    http://rapid-i.com/rapidforum/index.php/topic,8677.0.html
  • mafern76 Member Posts: 45 Contributor II
    sgenzer wrote:

    Hi...have you tried looping by value (ID) and then setting the data with the filtered set? I do this all the time. It's less efficient than pivoting but sometimes cleaner.

    Scott
    Scott, I think you actually meant to apply differentiate inside the loop I posted before.

    That makes it work for me, I don't need to work horizontally anymore I guess.

    Thanks a bunch!
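The approach that worked in this thread (loop over each id value, filter the examples for that id, differentiate the filtered set, then append the pieces) can be mirrored in plain Python as a sanity check. This is only a sketch of the logic, not the RapidMiner process itself; the helper names are hypothetical, and the first difference of each id is set to 0 to match the table in the question, which may differ from what the Differentiate operator outputs for the first row.

```python
# Rows are (id, time, att0), taken from the sample data in the question.
rows = [
    (1, 1, 5),
    (2, 1, 8), (2, 2, 9),
    (3, 2, 5), (3, 4, 4), (3, 5, 2),
    (4, 5, 6), (4, 6, 5), (4, 8, 8), (4, 10, 5),
]


def differentiate(series):
    """Difference to the previous value; the first value gets 0 (an assumption)."""
    return [0] + [b - a for a, b in zip(series, series[1:])]


def differentiate_per_id(rows):
    """Loop over ids, filter the rows of each id, differentiate, then append."""
    ids = sorted({id_ for id_, _, _ in rows})  # one loop iteration per id value
    appended = []
    for current in ids:
        # Filter step: keep only this id, ordered by time.
        subset = sorted((r for r in rows if r[0] == current), key=lambda r: r[1])
        diffs = differentiate([att0 for _, _, att0 in subset])
        # Append step: collect the per-id results into one table.
        appended.extend((i, t, a, d) for (i, t, a), d in zip(subset, diffs))
    return appended


per_id = differentiate_per_id(rows)
for row in per_id:
    print(*row)
```

Each id is differentiated in isolation, so no difference is ever taken between the last row of one id and the first row of the next.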
  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager
    Yes, sorry, that sounds right. Glad you made it work!

    Scott