IDempiere workshop 2015/transcript
First day
Agenda
We looked into the agenda at IDempiere_workshop_2015 and talked about additional things:
- Commercial
- sales stoppers by areas
- iDempiere distros
- iDempiere light
- Forums - functional - or open an iDempiere-technical forum
- Collect information about implementation
- plugin to collect anonymous information
- Ux
- Usability of webstore - embedding into iDempiere
- JSF?
- Usability of iDempiere - UI - User experience
- Functional
- Asset Accounting
- surcharges and discounts conditions
- pricing matrix
- global vs. specific price list
- BOM alternatives/optionals logic
- External notifications: SMS, mailers, etc.
- EMail-Marketing
- Order document type hardcoded business logic
- Workflows
- Quick Info
- LCO Global taxation
- Alerts
- System configurator organization
- Vision about manufacturing
- Average costing
- Reactivating invoices, payments, bank
- Dictionary
- Adding fields to sales orders - issues about positioning
- Every record must have a unique key
- Yes, about 20 tables have no key; the most important is product price. Others are access tables. We do not need keys for storage or translations. We can create a JIRA ticket and contribute a patch
- Technical
- OSGi in general
- Complex automated testing - jmeter, fitnesse, selenium
- RESTful webservices - token
- Dependencies of jasper libraries - jasper core cannot be replaced
- Mobile UI - Synchronize
- extending
- Best practices for community plugins
- Events generic
- Replication
- Performance
- AWS
- Database scaling
- Deployment Process - best practices
- Memory consumption
- opening multiple windows
- Periodical restart
Each record should have a primary key
Primary keys are needed to audit changes in the database; that is needed for pricing. Chuck wants to create a JIRA ticket for that (JIRA), and someone has to work on it. We do not think we need primary keys for storage (too much overhead) nor for translations. There are some more tables to think about.
Performance
Speed performance
- Chuck's idea: change the log4j output to be more easily parseable (e.g. creating a CSV file).
- Carlos' idea: a window and view collecting all audit tables (a window that shows "what happened in the system during this period") (JIRA).
- We can add a log of SQL queries (while we are working on logging; that is not about performance but about debugging) (JIRA)
- Traffic splitting can be done with pfSense. That allows routing iDempiere traffic over a different network connection than users watching YouTube.
- Performance issues can be in the network, the apache/nginx proxy, or in iDempiere itself. In Carlos' experience, in most cases problems in the user experience (complaints: "everything is running slow!") lie outside of iDempiere. If iDempiere is the problem, in most cases it comes down to bad code.
- If your server runs at 100% CPU, there are tools to create a Java VM memory dump. It can be analysed in Eclipse to see what is happening
- You can run the server in debug mode, but that hurts performance (even when you are not using it). In debug mode, Eclipse can connect to the running server and show which code the long-running processes are executing
- There is no way to stop a running process. That could be changed by changing the code of the processes, but that is not easy to do everywhere. Up to now we have neither an API nor a user interface for that. We could provide a template for how to stop a running thread (JIRA)
- Often the problem is the code (mostly self-written plugins); the best practice is to keep transactions short
- If you have a very long-running process that uses a transaction, you have two options:
- a) use a PostgreSQL command to increase the transaction timeout. That can make the whole system very slow, especially for other concurrent users or processes
- b) cut your transaction into smaller pieces. This is the advised solution: keep your transactions as short as possible.
- idea: schedule background job (JIRA)
- idea: force background based on parameters (JIRA)
- pgbadger is a tool to find out about PostgreSQL queries. It analyses the queries from the PostgreSQL logs and helps to find out which queries take more or less time.
- pgtune can help tune the PostgreSQL configuration
- You can solve some issues with complex PostgreSQL queries by adding indexes
- If the CPU is not at 100% but the process is stuck, you have to think about locks. In most cases locks can be avoided by writing better code.
- Examples are AD_Sequence and M_Storage: if you lock these in a transaction and run a long loop, you create a bottleneck.
- You can sometimes refactor your code to fetch document numbers at the end of the process
- In pgAdmin you can see the PostgreSQL postmaster process id of a locked process. Carlos used a new column in AD_ChangeLog with a default value of "pg_backend_pid()". That hurts performance, but it helps to know which iDempiere process belongs to which PostgreSQL backend process.
- idea: document sequences admitting holes (JIRA)
- If a table has a cache, it has a ".get(...)" method. Using it is faster and uses less memory.
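The transaction-splitting advice above (option b) can be sketched as follows. This is a minimal, hypothetical Java sketch: `ChunkedProcessor` is an illustrative name, not an iDempiere class, and the commit counter stands in for a real `trx.commit()` call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Sketch: process many records, committing every chunkSize records
 *  instead of holding one giant transaction open. */
public class ChunkedProcessor {

    /** Runs the handler for every record id and returns the number of commits made. */
    public static int process(List<Integer> recordIds, int chunkSize, Consumer<Integer> handler) {
        int commits = 0;
        int inChunk = 0;
        for (int id : recordIds) {
            handler.accept(id);          // e.g. update the record inside the open transaction
            inChunk++;
            if (inChunk == chunkSize) {  // chunk full: commit and release the locks
                commits++;               // in iDempiere this would be trx.commit()
                inChunk = 0;
            }
        }
        if (inChunk > 0) commits++;      // final commit for the remainder
        return commits;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 1050; i++) ids.add(i);
        // 10 full chunks of 100 plus one remainder of 50
        System.out.println("commits: " + process(ids, 100, id -> {})); // prints commits: 11
    }
}
```

Each commit releases row locks (e.g. on AD_Sequence or M_Storage), so concurrent users are blocked for at most one chunk instead of the whole run.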
Memory performance
- A memory issue can arise if you use a table/direct field on a big table. This kind of field reads all records to create the pull-down list. There is a hard-coded limit (of about 200 or so); exceeding it creates a message in the log, and the user gets only part of the entries instead of all of them.
- Using a search field instead of a table list improves performance a lot
- If you use validations or dynamic validations, only the resulting (shorter) list takes memory in the browser. Adding a validation might therefore also help solve memory issues.
- There is the idea to not allow the user to open the very same record in a second window, or to make the second window read-only (and show the user a message like "you opened this record twice")
- There is a plugin from Nicolas that prevents a user from logging in a second time
- idea: restrict number of open windows - in general (JIRA)
Periodical restart
ADempiere needed frequent restarts to run stably. iDempiere and Java 7 improved that a lot. In principle you can restart the server once a day. An idea is to have a script that restarts the server only if no processes are running (JIRA).
Better search index
We spoke about using Lucene (which is used e.g. by Solr) as a search index. Norbert's idea was to create a special type of search field that uses the Lucene index for searching.
To use such a search index you have to update the index after changes in the database. That does not have to happen synchronously; it can run in another thread or similar.
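The "update the index in another thread" idea can be sketched like this; a minimal sketch in which a `ConcurrentHashMap` stands in for the real Lucene index and `AsyncIndexer` is a hypothetical name:

```java
import java.util.Map;
import java.util.concurrent.*;

/** Sketch: keep a search index up to date asynchronously, so the
 *  user transaction does not wait for the (possibly slow) reindexing. */
public class AsyncIndexer implements AutoCloseable {
    // single worker thread: index updates are queued and applied in order
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    // stand-in for the real Lucene index
    private final Map<Integer, String> index = new ConcurrentHashMap<>();

    /** Called after a record changes; returns immediately. */
    public void recordChanged(int recordId, String text) {
        worker.submit(() -> index.put(recordId, text)); // reindex off the request thread
    }

    public String lookup(int recordId) {
        return index.get(recordId);
    }

    @Override
    public void close() throws InterruptedException {
        worker.shutdown();                              // finish queued updates, then stop
        worker.awaitTermination(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        AsyncIndexer idx = new AsyncIndexer();
        idx.recordChanged(100, "red bicycle"); // returns at once; indexing happens in the worker
        idx.close();                           // waits until the queued update is applied
        System.out.println(idx.lookup(100));   // prints red bicycle
    }
}
```

The trade-off is that the index lags slightly behind the database, which is usually acceptable for full-text search.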
A search like "%key" cannot use a regular index in PostgreSQL or Oracle; it causes a full-table scan.
Another problem is that big tables take a long time to display if you forget to enter a search key. This does not directly belong to the search index but to better queries. In MRole you can restrict the maximum number of records a query may return.
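One common workaround for the "%key" case (not discussed in the session, added here as an illustration) is to index the reversed column value: in PostgreSQL you can create an expression index on reverse(column), so a suffix search becomes an index-friendly prefix search on the reversed string. A Java sketch of the idea, using a sorted TreeSet as a stand-in for the B-tree index:

```java
import java.util.*;

/** Sketch: a suffix search LIKE '%key' becomes a prefix search 'yek%'
 *  on reversed values, which a B-tree-like structure can answer with a range scan. */
public class SuffixSearch {
    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    /** Finds all values ending with the given suffix via the "index" of reversed values. */
    static List<String> endingWith(NavigableSet<String> reversedIndex, String suffix) {
        String prefix = reverse(suffix);
        List<String> hits = new ArrayList<>();
        // range scan over the sorted index: all entries starting with the reversed suffix
        for (String r : reversedIndex.tailSet(prefix)) {
            if (!r.startsWith(prefix)) break; // left the matching range
            hits.add(reverse(r));
        }
        return hits;
    }

    public static void main(String[] args) {
        NavigableSet<String> idx = new TreeSet<>();
        for (String v : List.of("monkey", "donkey", "keyboard", "turkey"))
            idx.add(reverse(v)); // "index" stores the reversed values
        System.out.println(endingWith(idx, "key")); // prints [donkey, monkey, turkey]
    }
}
```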
Database scaling
haproxy is a proxy that can load-balance users across many iDempiere application servers. All servers use the same database server. (Chuck Boecking has used haproxy for one and a half years. It is very well documented and works very well for him.)
The database can be replicated by PostgreSQL to another server.
- The replication mode can be "immediate" (synchronous). That makes sure that all clients see the same data, but it makes commits (according to Carlos' tests) about 3 times slower. A problem in Carlos' test is that the replica server shows a transaction as committed when the data is in the WAL but not yet in the table. That can break iDempiere processes.
- In the "deferred" (asynchronous) mode you are not sure that the replica has the same data, but it is not slower than running without replication.
You can use "deferred" replication for backups. You can lose two seconds of data without a performance penalty.
There is a program "pgpool" that you can use as a load balancer for PostgreSQL. Carlos tested it (with PostgreSQL 9.x). You create a PostgreSQL master server and several read-only replicas; all queries go through pgpool. All queries that change data, and function calls that write, have to be routed to the master server. Any select outside a transaction can be served by a replica (info windows, reports, but not financial reports, which have a big performance hit).
Instead of PostgreSQL replication (based on the PostgreSQL logs) you can use pgpool replication, which replicates all statements to all servers. That worked very well, but with two servers it took 5 times as long as with a single database.
A much better idea would be to separate read and write calls in the iDempiere code. Walking Tree worked on that: http://blogs.walkingtree.in/2013/03/07/seperate-database-for-read-and-write-in-adempiere/ This is not yet in iDempiere (JIRA)
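The read/write separation mentioned above could look like the following. This is a simplified, hypothetical sketch (`ConnectionRouter` is not an iDempiere or pgpool class); real routers also have to detect writing functions inside SELECTs, which this skips:

```java
/** Sketch: route read-only statements to a replica and everything else to the master. */
public class ConnectionRouter {
    private final String masterUrl;
    private final String replicaUrl;

    public ConnectionRouter(String masterUrl, String replicaUrl) {
        this.masterUrl = masterUrl;
        this.replicaUrl = replicaUrl;
    }

    /** Decides which server a SQL statement should go to. */
    public String routeFor(String sql, boolean inTransaction) {
        String s = sql.trim().toLowerCase();
        // plain selects outside a transaction may go to a replica;
        // writes, and selects inside a transaction, must hit the master
        if (s.startsWith("select") && !inTransaction) return replicaUrl;
        return masterUrl;
    }

    public static void main(String[] args) {
        ConnectionRouter r = new ConnectionRouter(
                "jdbc:postgresql://master/idempiere",
                "jdbc:postgresql://replica/idempiere");
        System.out.println(r.routeFor("SELECT * FROM M_Product", false)); // replica
        System.out.println(r.routeFor("UPDATE C_Order SET Description='x'", false)); // master
    }
}
```

Selects inside a transaction go to the master because the replica may not yet contain rows written earlier in that same transaction.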
Chuck also recommends pgbouncer. It allows many more application servers on a single database instance because you do not need memory for every connection. Without it, every application server opens around 10 database connections, and each of these starts a backend process in the PostgreSQL database that uses memory.
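The memory argument for a connection pooler can be illustrated with a toy pool: many clients share a handful of "connections", so the database only ever sees a few backend processes. A hypothetical sketch (`TinyPool` is not a real library; each Integer stands in for one real backend connection):

```java
import java.util.Set;
import java.util.concurrent.*;

/** Sketch: a fixed set of connections shared by many clients,
 *  the way a pooler multiplexes clients onto few database backends. */
public class TinyPool {
    private final BlockingQueue<Integer> pool = new LinkedBlockingQueue<>();

    public TinyPool(int size) {
        for (int i = 0; i < size; i++) pool.add(i); // each number = one backend connection
    }

    public Integer borrow() throws InterruptedException {
        return pool.take(); // blocks until a connection is free
    }

    public void release(Integer c) {
        pool.add(c);
    }

    public static void main(String[] args) throws Exception {
        TinyPool pool = new TinyPool(5);                            // only 5 backends...
        Set<Integer> used = ConcurrentHashMap.newKeySet();
        ExecutorService clients = Executors.newFixedThreadPool(50); // ...shared by 50 clients
        for (int i = 0; i < 200; i++) {
            clients.submit(() -> {
                try {
                    Integer c = pool.borrow();
                    used.add(c);      // record which backend served this request
                    pool.release(c);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        clients.shutdown();
        clients.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("distinct backends used: " + used.size()); // at most 5
    }
}
```

The database allocates memory per backend process, so 50 clients over 5 pooled connections cost a tenth of the backend memory of 50 direct connections.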
