How to serialize log data
atifshaikh4514
Member, Posts: 4, Contributor I
I have audit log data in which each tuple represents an event associated with a particular user id, along with a list of other attributes, both nominal and numerical. What is the best way to transform the data into a set of user web clicks using RapidMiner?
Off the top of my head, I can think of quantifying all attributes and serializing them. But my main concern is how to deal with variable-length click sequences.
I need to create user click profiles as an end result.
Answers
I am not sure if this is directly possible with standard operators (I think there is an audit / log file input operator in the Text plugin, but I would have to check this myself...). If this does not help, you may have to code your own operator for this. But maybe someone else knows a better solution using existing operators.
You could of course determine the maximum number of events in a sequence, create one attribute per position up to that maximum, and set the unused attributes of shorter sequences to missing values. But I could ask a former colleague who works on sequence mining with RapidMiner how he represents this.
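To make the padding idea concrete, here is a minimal sketch in plain Python rather than RapidMiner operators. The tuple layout (user id, timestamp, event name) and the function name `to_padded_sequences` are assumptions for illustration only, since the actual log format was not posted.

```python
from collections import defaultdict

# Hypothetical audit log: (user_id, timestamp, event) tuples, in arbitrary order.
events = [
    ("u1", 3, "logout"),
    ("u1", 1, "login"),
    ("u2", 1, "login"),
    ("u1", 2, "view_report"),
    ("u2", 2, "logout"),
]

def to_padded_sequences(events):
    """Group events by user, order them by timestamp, pad to the longest sequence."""
    by_user = defaultdict(list)
    for user_id, timestamp, event in events:
        by_user[user_id].append((timestamp, event))

    # Sorting the (timestamp, event) pairs orders each user's clicks in time;
    # ties on timestamp fall back to the event name.
    sequences = {
        user: [event for _, event in sorted(items)]
        for user, items in by_user.items()
    }
    max_len = max((len(seq) for seq in sequences.values()), default=0)

    # None stands in for a missing value in the unused trailing positions.
    return {
        user: seq + [None] * (max_len - len(seq))
        for user, seq in sequences.items()
    }

padded = to_padded_sequences(events)
for user, seq in sorted(padded.items()):
    print(user, seq)
```

Each user then becomes one example with a fixed number of position attributes. The obvious drawback is that a single very long sequence makes every other row mostly missing values.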
Cheers,
Ingo
I came across some techniques from relational mining where a similar reverse pivoting is used, but instead of representing the clickstreams as they are, summaries of them are stored. That does not seem applicable to my problem.
Still scratching my head...
Have a nice weekend.
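For comparison, the summary-style representation could look roughly like the sketch below. It is plain Python, and the tuple layout (user id, event name, a numerical `duration` attribute) and the function name `summarize` are made up for illustration. Note that it discards the order of the clicks, which is presumably why it does not fit a clickstream problem.

```python
from collections import Counter, defaultdict

# Hypothetical audit log: (user_id, event, duration_seconds) tuples.
events = [
    ("u1", "login", 2.0),
    ("u1", "view_report", 30.0),
    ("u1", "view_report", 10.0),
    ("u2", "login", 4.0),
]

def summarize(events):
    """Collapse each user's events into fixed-size summary features."""
    counts = defaultdict(Counter)    # per-user counts of each nominal event
    durations = defaultdict(list)    # per-user values of the numerical attribute
    for user_id, event, duration in events:
        counts[user_id][event] += 1
        durations[user_id].append(duration)

    return {
        user: {
            "event_counts": dict(counts[user]),
            "n_events": len(durations[user]),
            "mean_duration": sum(durations[user]) / len(durations[user]),
        }
        for user in counts
    }

summaries = summarize(events)
for user, profile in sorted(summaries.items()):
    print(user, profile)
```

Every user ends up with the same set of attributes regardless of sequence length, at the cost of losing the sequence itself.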
Atif.
As far as I know, the operator expects Apache log files, but I could be mistaken. So developing your own input operator may be the only option right now, sorry.
Cheers,
Ingo
But maybe you can be a bit more specific about what you intend to do with your data...
Regards,
Christian