Integrating time series data with different intervals and accommodating lead/lag time
Hi, new RapidMiner user here. I've only been using it for a few days and have already learned more by following the tutorials and through trial and error than in weeks of trying to do the same with Python. Big thanks to Thomas Ott for the series on building AI market models; it was incredibly helpful. The problem I'm facing comes from a lack of knowledge, and I thought the easiest way to gain that knowledge was to ask.
I'm trying to build a process that forecasts future foreign exchange values; a pointless endeavour maybe, but it's fun. The time series data I'm working with has different time intervals, such as end-of-day data and end-of-month data. Is there a standard way to put these together? I'm wary of going overkill on the details here. The date range would be the same (Jan 2010 to present) regardless of the periodicity.
Second thing (and I don't know if it is a thing) is that the economic indicators I'm looking at affect (if at all) my label (i.e., the monthly closing price of EUR/USD) on different time scales. Some are leading, others coincident or lagging. Do I need to tell my process that a certain leading indicator isn't likely to affect a given price for 2 or 3 months? Or that a moving average is a reflection of events that have already passed?
Release dates versus periods covered are also confusing the heck out of me. For example, the OECD releases certain reports roughly 6 weeks after the start of the month they cover (or 2 weeks after it ends, if that's easier), so currently the data for Feb 2017 is out and March's data won't be released until midway through April. Are there any steps I need to take when creating a process to accommodate these factors?
Thanks in advance and if I haven't explained something clearly or more details are needed let me know,
Alex
Best Answer
Thomas_Ott (RapidMiner Certified Analyst, RapidMiner Certified Expert, Member; Posts: 1,761; Unicorn)
Hi Alex,
Did you try the Fin and Econ extension? There's a way to do rebasing in that extension. If you haven't already, make sure to download and install the Series extension.
Of course, you can always roll up the series to a single common period, but it will probably take a few operators to do so.
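Outside RapidMiner, the same roll-up idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `roll_up_to_month_end` and `join_on_month` are made up for this example, the numbers are invented, and "last observation of the month" is just one possible aggregation (a monthly mean would be another).

```python
from datetime import date

def roll_up_to_month_end(daily):
    """Keep the last available observation in each calendar month.

    `daily` is an iterable of (date, value) pairs in any order.
    Returns a dict keyed by (year, month).
    """
    monthly = {}
    for day, value in sorted(daily):
        # Later days in the same month overwrite earlier ones.
        monthly[(day.year, day.month)] = value
    return monthly

def join_on_month(*monthly_series):
    """Inner-join several {(year, month): value} dicts on their shared months."""
    shared = set(monthly_series[0])
    for series in monthly_series[1:]:
        shared &= set(series)
    return {key: tuple(s[key] for s in monthly_series) for key in sorted(shared)}

# Invented numbers, for illustration only.
eurusd_daily = [
    (date(2017, 1, 30), 1.0695),
    (date(2017, 1, 31), 1.0798),
    (date(2017, 2, 27), 1.0587),
    (date(2017, 2, 28), 1.0576),
]
indicator_monthly = {(2017, 1): 100.1, (2017, 2): 100.2}

eurusd_monthly = roll_up_to_month_end(eurusd_daily)
combined = join_on_month(eurusd_monthly, indicator_monthly)
```

Each row of `combined` then holds one value per series for the same month, which is the shape a monthly model needs.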
Leading, coincident, and lagging indicators are incredibly tricky to apply in practice. My usual approach is this: if a publicly posted indicator is lagged by 6 months, I put it into the model on the day it was released. From there you can try to forecast the lagging indicator with a process like the one below (you'll have to tweak this process).
The better application is finding out which economic indicator, or collection of indicators, works better. For that you can do something like Feature Selection, which is another process altogether.
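To make the "use it on the day it was released" idea concrete, here is a small Python sketch of an as-of lookup: for each modelling date it picks the most recent figure that had actually been published by then, so the model never sees data from the future. The release calendar, the values, and the helper name `latest_released` are all hypothetical.

```python
from bisect import bisect_right
from datetime import date

def latest_released(releases, as_of):
    """Return the most recent (period, value) published on or before `as_of`.

    `releases` is a list of (release_date, period_label, value) tuples,
    sorted by release_date. Returns None if nothing was published yet.
    """
    release_dates = [r[0] for r in releases]
    idx = bisect_right(release_dates, as_of)
    if idx == 0:
        return None
    _, period, value = releases[idx - 1]
    return period, value

# Hypothetical calendar: each figure appears about six weeks after
# the start of the month it covers. Values are invented.
releases = [
    (date(2017, 2, 14), "2017-01", 100.1),
    (date(2017, 3, 15), "2017-02", 100.2),
    (date(2017, 4, 14), "2017-03", 100.3),
]

month_ends = [date(2017, 1, 31), date(2017, 2, 28),
              date(2017, 3, 31), date(2017, 4, 30)]
features = {d: latest_released(releases, d) for d in month_ends}
```

In this sketch the row for the end of March carries the February figure, because that is the newest value that was public at the time.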
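One simple way to get a first feel for which indicators matter, and at what delay, is to correlate each indicator with the label at several lags. The Python sketch below is not RapidMiner's Feature Selection; it is a crude filter-style stand-in, with invented data and made-up function names, that assumes both series are already on the same monthly grid.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def lagged_correlations(indicator, label, max_lag):
    """Correlate indicator[t - lag] with label[t] for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        xs = indicator[:len(indicator) - lag]  # indicator shifted back by `lag`
        ys = label[lag:]                       # label from period `lag` onwards
        result[lag] = pearson(xs, ys)
    return result

# Invented data: the label simply repeats the indicator two periods later.
indicator = [1, 3, 2, 5, 4, 6, 8, 7, 9, 10]
label = [0, 0] + indicator[:-2]

corrs = lagged_correlations(indicator, label, max_lag=3)
best_lag = max(corrs, key=lambda lag: abs(corrs[lag]))
```

On real price data it is usually safer to run this on changes or returns rather than raw levels, since trending series tend to correlate with each other regardless of any real relationship.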
<parameter key="horizon" value="5"/>
<parameter key="symbol" value="XOM"/>
<parameter key="start_date" value="2016-01-01"/>
<parameter key="end_date" value="2017-03-21"/>
<parameter key="test_window_width" value="20"/>
<parameter key="main_criterion" value="prediction_trend_accuracy"/>
<operator activated="true" class="series:windowing" compatibility="7.4.000" expanded="true" height="82" name="Windowing" width="90" x="380" y="34">
<operator activated="true" class="series:windowing" compatibility="7.4.000" expanded="true" height="82" name="Windowing (2)" width="90" x="380" y="136">
<parameter key="main_criterion" value="prediction_trend_accuracy"/>
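For readers unfamiliar with windowing, the general idea behind the Windowing steps above is to turn one series into rows of lagged values plus a label some steps ahead. The Python sketch below shows a generic sliding window. It assumes the label is the value `horizon` steps after the last value in the window, which may not match the Series extension's exact definition, and `make_windows` is a made-up name.

```python
def make_windows(series, window_size, horizon):
    """Turn a series into (features, label) rows.

    Each row holds `window_size` consecutive values as features; the label
    is the value `horizon` steps after the last value in the window
    (an assumption made for this sketch).
    """
    rows = []
    last_start = len(series) - window_size - horizon
    for start in range(last_start + 1):
        window = series[start:start + window_size]
        label = series[start + window_size - 1 + horizon]
        rows.append((window, label))
    return rows

# Toy series, for illustration only.
series = [1, 2, 3, 4, 5, 6, 7, 8]
rows = make_windows(series, window_size=3, horizon=2)
```

Each row can then be fed to an ordinary learner, with the window values as attributes and the later value as the label.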