Overview

ACTION ITEM: Get presentation slides from attendees; post to Meetup wiki page.

Jens: ActiveX to Java; SAP Connector Overview

Rob (Red Dolphin): XML-SAX Input Performance

  • Benchmarking using XML with Pentaho; case study, issues, findings, results
  • Case:
    • Jointly used web-based application
    • Used by Dutch Social Security organizations (42,000 users, ~3,000,000 requests/mo)
    • Solution:
      • PDI 2.5 and MySQL 5
      • Log file => ETL layer => Data Warehouse
      • 2.5 GB XML log file daily
    • Issues:
      • XML-SAX input caused out-of-memory (OOM) errors (reading 2.5 GB into memory)
      • XML attributes were not in same XML hierarchy level
      • XML Input Path plugin had low performance when reading large files
    • Input Performance:
      • As the input file grows, processing time for the input step grows far faster than linearly (roughly doubling the file size took about six times as long):
        256 MB, ~8 min
        500 MB, ~50 min
        1000 MB, broke off after 2 hrs
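
To illustrate the streaming approach discussed in this talk, here is a minimal sketch using Python's standard xml.sax module. It is not the PDI step itself; the element and attribute names (log, request, page, source, user, path) are hypothetical stand-ins for the real log layout. The handler keeps only counters and the set of attribute names seen at each nesting depth, so memory use should not depend on file size, and it shows one way to cope with attributes that sit at different hierarchy levels.

```python
import io
import xml.sax


class LogScanner(xml.sax.ContentHandler):
    """Counts elements and records attribute names per depth without building a tree."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.counts = {}          # element name -> number of occurrences
        self.attrs_by_depth = {}  # nesting depth -> attribute names seen there

    def startElement(self, name, attrs):
        self.depth += 1
        self.counts[name] = self.counts.get(name, 0) + 1
        if attrs.getLength():
            self.attrs_by_depth.setdefault(self.depth, set()).update(attrs.getNames())

    def endElement(self, name):
        self.depth -= 1


def scan(stream):
    """Parse a file-like object event by event and return the filled handler."""
    handler = LogScanner()
    xml.sax.parse(stream, handler)
    return handler


# Hypothetical sample; the real log structure is not described in these notes.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<log source="app">
  <request user="u1"><page path="/a"/></request>
  <request user="u2"><page path="/b"/></request>
</log>"""

result = scan(io.BytesIO(SAMPLE))
print(result.counts)
print(result.attrs_by_depth)
```

For a real file, scan(open(path, "rb")) would read it incrementally, assuming the per-record state kept by the handler stays small.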

ACTION ITEM: Issues came up regarding the encoding of the XML files. We need to be sure there is a JIRA case for validating or ensuring correct processing of double-byte or extended character sets when they are entered natively into the XML file. The action sequence editor also has issues handling XML files that include extended characters: the server will process the action sequence, and it is valid XML, but the action sequence editor will reject the file as invalid.
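
As a starting point for such a validation case, the sketch below checks whether a file's bytes parse as XML under the encoding it declares. It uses Python's standard xml.etree.ElementTree; the helper name is_well_formed and the sample documents are hypothetical. The second sample shows one way a mismatch can arise (bytes written as ISO-8859-1 while UTF-8 is declared); this is an assumption for illustration, not a diagnosis of the editor problem noted above.

```python
import xml.etree.ElementTree as ET


def is_well_formed(data: bytes) -> bool:
    """Return True if the bytes parse as XML under their declared encoding."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


TITLE = "Übersicht 日本語"  # extended and double-byte characters

# Declaration and bytes agree: UTF-8 declared, UTF-8 written.
good = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    f"<action-sequence><title>{TITLE}</title></action-sequence>"
).encode("utf-8")

# Declaration and bytes disagree: UTF-8 declared, ISO-8859-1 written.
bad = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<action-sequence><title>Übersicht</title></action-sequence>"
).encode("iso-8859-1")

print(is_well_formed(good))
print(is_well_formed(bad))
print(ET.fromstring(good).findtext("title") == TITLE)
```

Running a check like this over the XML inputs before the ETL run would separate encoding problems in the files from problems in the tools that read them.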

Luc: Scrum and Agile

  • Maximizing business output; agile, but produce a usable set of results iteratively.
  • Structure projects and code so that you can incorporate changes to the Pentaho codeline immediately from a separate, isolated project. This keeps your own changes isolated until the code is merged into the Pentaho codeline.

MINGLE - Manages SCRUM projects well.

  • Suggests you synchronize your sprints with Pentaho sprints; saves work, makes sprinting more efficient.
  • Trust Pentaho community, Pentaho teams
  • Suggests that scrumming requires an awareness of the platform details that will benefit you.

ACTION ITEM: Good idea to fly Luc to Orlando to have him work with our sprint/build processes for outside input.

  • Define business value - give effort weight to stories. Weight new Pentaho features for your project!
  • Have to have confidence in, and accept, the Pentaho developers' coding styles.
  • Largest pain point: building distributions. Should be helped with dependency management, repackaging in 2.0.
  • Maven, Ivy well received
  • ACTION ITEM: Request for better communication from sprint standups - suggestion to webcam the standups. Really want major decision points from technical teams.
  • ACTION ITEM: Communication on version releases is definitely not clear; the community needs more real-time information on release direction. New branches need to be explained when they show up, so the community knows what they are for and what value they bring.
  • Remote sprinting, scrumming is a partial process at best.

Giovanni: ETL Case Studies

  • Periodic reporting for regional healthcare department; KPIs, analysis
  • ETL - SAP inputs; ~ 1,000,000 records
  • Pharmaceutical reporting
  • ETL - SAP inputs; processing and reporting in MS Access
  • Telecom ETL