Excerpt |
---|
Helpful Scripts, Commands, etc. for Working with Hadoop Configurations |
Setting the current shim for all your Pentaho applications with one command.
This Linux shell command finds every plugin.properties file in the current directory and below, and sets active.hadoop.configuration to the provided argument. The argument is the name of the Hadoop configuration to use. Because the command reads the argument from $1, run it from a script or shell function; at an interactive prompt $1 is normally empty.
Code Block |
---|
find . -wholename "*pentaho-big-data-plugin/plugin.properties" -exec sed -i "s/\(active\.hadoop\.configuration=\)\(.*\)/\1$1/g" {} \; |
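Since the command relies on `$1`, one way to use it is to wrap it in a shell function. The following is a minimal sketch, not part of the original page: the function name `set_shim` is hypothetical, and it assumes GNU `find` and GNU `sed` (for `-wholename` and `-i`).

```shell
# Hypothetical helper: set_shim <hadoop-configuration-name>
# Assumes GNU find and GNU sed.
set_shim() {
  # Refuse to run without an argument, otherwise the property would be blanked.
  if [ -z "$1" ]; then
    echo "usage: set_shim <hadoop-configuration-name>" >&2
    return 1
  fi
  # Same find/sed idea as the one-liner; "$1" is the function's first argument.
  find . -wholename "*pentaho-big-data-plugin/plugin.properties" \
    -exec sed -i "s/\(active\.hadoop\.configuration=\)\(.*\)/\1$1/g" {} \;
}
```

After sourcing the function, it could be called from the directory that contains your Pentaho installations as `set_shim cdh54`, where `cdh54` stands in for whichever folder name exists under your hadoop-configurations directory.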
Printing the current shim configured for all your Pentaho applications with one command.
This Linux shell command finds every plugin.properties file in the current directory and below, and prints each file's path followed by its active.hadoop.configuration value.
Code Block |
---|
find . -wholename "*pentaho-big-data-plugin/plugin.properties" -exec ls {} \; -exec grep -o "active\.hadoop\.configuration=[0-9A-Za-z\-]*" {} \; | cut -f2 -d= |
Alias to find the shims directory and change your current directory to it:
Code Block |
---|
alias goshims="cd \`find . -wholename \"*pentaho-big-data-plugin/hadoop-configurations\"\`" |
Alias to find the active shim's directory and change your current directory to it:
Code Block |
---|
alias goshim='SHIM_PROP="`find . -wholename \"*pentaho-big-data-plugin/plugin.properties\" -exec ls {} 2>/dev/null \; | head --lines=1`"; \
ACTIVE_SHIM="`grep -o \"active\.hadoop\.configuration=[0-9A-Za-z\-]*\" \"$SHIM_PROP\" | sed \"s/.*=//g\"`"; \
HADOOP_CONFIG_PATH="`grep -o \"hadoop\.configurations\.path=[0-9A-Za-z\-]*\" \"$SHIM_PROP\" | sed \"s/.*=//g\"`"; \
cd "`dirname \"$SHIM_PROP\"`/$HADOOP_CONFIG_PATH/$ACTIVE_SHIM"'
|
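The nested quoting in the alias can be hard to read and modify. As an alternative, here is a sketch of the same steps written as a shell function. The name `goshim_fn` is hypothetical, and unlike the alias it takes everything after the `=` sign as the value instead of restricting it to letters, digits, and hyphens. Like the alias, it leaves its working variables set in the calling shell.

```shell
# Function form of the goshim alias (a sketch, not the original author's code).
goshim_fn() {
  # First plugin.properties found below the current directory.
  shim_prop=$(find . -wholename "*pentaho-big-data-plugin/plugin.properties" 2>/dev/null | head -n 1)
  if [ -z "$shim_prop" ]; then
    echo "no pentaho-big-data-plugin/plugin.properties found below $PWD" >&2
    return 1
  fi
  # Read the two properties that together locate the active shim folder.
  active_shim=$(sed -n 's/^active\.hadoop\.configuration=//p' "$shim_prop" | head -n 1)
  shim_config_path=$(sed -n 's/^hadoop\.configurations\.path=//p' "$shim_prop" | head -n 1)
  # Change into <plugin dir>/<configurations path>/<active shim>.
  cd "$(dirname "$shim_prop")/$shim_config_path/$active_shim"
}
```

A function also makes it possible to report a clear error when no plugin.properties is found, which the alias does not do.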
Change the ResourceManager information from Hortonworks Sandbox to my.resourcemanager.com (replace this with your RM's hostname):
Code Block |
---|
find . -name "*-site.xml" -exec sed -i "s/sandbox.hortonworks.com/my.resourcemanager.com/g" {} \;
|
Change the ResourceManager information from Cloudera QuickStart VM to my.resourcemanager.com (replace this with your RM's hostname):
Code Block |
---|
find . -name "*-site.xml" -exec sed -i "s/clouderamanager.cdh5.test/my.resourcemanager.com/g" {} \; |
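Both hostname replacements follow the same pattern, so they can be folded into one parameterized helper. This is a sketch under assumptions: the function name `replace_site_host` is hypothetical, GNU `sed -i` is assumed, and the hostnames are assumed not to contain `/` or other characters that are special to `sed`. It also escapes the dots in the old hostname so that they match only literal dots.

```shell
# Hypothetical helper: replace_site_host <old-host> <new-host>
# Rewrites the hostname in every *-site.xml below the current directory.
replace_site_host() {
  if [ "$#" -ne 2 ]; then
    echo "usage: replace_site_host <old-host> <new-host>" >&2
    return 1
  fi
  # Escape dots so "a.b" matches only a literal dot, not any character.
  old_re=$(printf '%s' "$1" | sed 's/\./\\./g')
  find . -name "*-site.xml" -exec sed -i "s/$old_re/$2/g" {} \;
}
```

For example, `replace_site_host sandbox.hortonworks.com my.resourcemanager.com` performs the Hortonworks Sandbox replacement. Running `grep -rl sandbox.hortonworks.com .` first shows which files would be touched.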
Increase the Mondrian query timeout:
Analyzer reports that use Hive data sources (or anything else that generates a MapReduce job) can exceed the timeout. The command below raises it from 300 seconds (5 minutes) to 600 seconds (10 minutes); replace 600 with whatever value you need. It only matches a line that currently reads exactly mondrian.rolap.queryTimeout=300.
Code Block |
---|
sed -i "s/^mondrian.rolap.queryTimeout=300$/mondrian.rolap.queryTimeout=600/" pentaho-solutions/system/mondrian/mondrian.properties
|
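If the timeout has already been changed from 300, the command above will not match. A sketch of a variant that sets the timeout regardless of its current value follows. The function name `set_mondrian_timeout` and the optional second argument are hypothetical, and GNU `sed -i` is assumed.

```shell
# Hypothetical helper: set_mondrian_timeout <seconds> [path-to-mondrian.properties]
set_mondrian_timeout() {
  # Default to the path used in the command above, relative to the current directory.
  props=${2:-pentaho-solutions/system/mondrian/mondrian.properties}
  # Accept only a non-empty string of digits as the timeout.
  case "$1" in
    ''|*[!0-9]*)
      echo "usage: set_mondrian_timeout <seconds> [file]" >&2
      return 1
      ;;
  esac
  # Replace whatever value is currently set, not only 300.
  sed -i "s/^\(mondrian\.rolap\.queryTimeout=\).*$/\1$1/" "$props"
}
```

Calling `set_mondrian_timeout 600` from the server's installation directory has the same effect as the original command when the current value is 300.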