pkgsrc-Changes archive


CVS commit: pkgsrc/databases/mysql-cluster



Module Name:    pkgsrc
Committed By:   jnemeth
Date:           Mon Feb  9 06:46:55 UTC 2015

Modified Files:
        pkgsrc/databases/mysql-cluster: Makefile.common PLIST distinfo

Log Message:
Update to MySQL Cluster 7.3.8:

Changes in MySQL Cluster NDB 7.3.8 (5.6.22-ndb-7.3.8) (2015-01-21)

   MySQL Cluster NDB 7.3.8 is a new release of MySQL Cluster, based on
   MySQL Server 5.6 and including features from version 7.3 of the NDB
   storage engine, as well as fixing a number of recently discovered bugs
   in previous MySQL Cluster releases.

   This release also incorporates all bugfixes and changes made in
   previous MySQL Cluster releases, as well as all bugfixes and feature
   changes which were added in mainline MySQL 5.6 through MySQL 5.6.22
   (see Changes in MySQL 5.6.22 (2014-12-01)).

   Functionality Added or Changed
     * Performance: Recent improvements made to the multithreaded
       scheduler were intended to optimize the cache behavior of its
       internal data structures, with members of these structures placed
       such that those local to a given thread do not overflow into a
       cache line which can be accessed by another thread. Where required,
       extra padding bytes are inserted to isolate cache lines owned (or
       shared) by other threads, thus avoiding invalidation of the entire
       cache line if another thread writes into a cache line not entirely
       owned by itself. This optimization improved MT Scheduler
       performance by several percent.
       It has since been found that the optimization just described
       depends on the global instance of struct thr_repository starting at
       a cache line aligned base address as well as the compiler not
       rearranging or adding extra padding to the scheduler struct; it was
       also found that these prerequisites were not guaranteed (or even
       checked). Thus this cache line optimization previously worked
       only when g_thr_repository (that is, the global instance)
       happened to be cache line aligned by accident. In addition, on 64-bit
       platforms, the compiler added extra padding words in struct
       thr_safe_pool such that attempts to pad it to a cache line aligned
       size failed.
       The current fix ensures that g_thr_repository is constructed at
       a cache line aligned address, and the constructors have been
       modified to verify cache line aligned addresses where these are
       assumed by design.
       Results from internal testing show improvements in MT Scheduler
       read performance of up to 10% in some cases, following these
       changes (see the alignment sketch after this list). (Bug
       #18352514)
     * Cluster API: Two new example programs, demonstrating reads and
       writes of CHAR, VARCHAR, and VARBINARY column values, have been
       added to storage/ndb/ndbapi-examples in the MySQL Cluster source
       tree. For more information about these programs, including source
       code listings, see NDB API Simple Array Example, and NDB API Simple
       Array Example Using Adapter.
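
   As a minimal illustration of the cache line padding technique in the
   Performance item above (Bug #18352514), the sketch below pads a
   per-thread structure to a full cache line and verifies the alignment
   the design assumes. It is a hedged example, not the NDB kernel code:
   the names ThreadLocalState, Repository, and CACHE_LINE_SIZE are
   invented here, and compile-time assertions plus one run-time assert
   stand in for the constructor checks described in the note.

      // Illustration only: members local to one thread are padded so they
      // never share a cache line with another thread (no false sharing),
      // and the assumed alignment is checked rather than left to chance.
      #include <cassert>
      #include <cstddef>
      #include <cstdint>

      static constexpr std::size_t CACHE_LINE_SIZE = 64;  // typical x86-64 line

      // Per-thread state rounded up to exactly one cache line.
      struct alignas(CACHE_LINE_SIZE) ThreadLocalState {
        std::uint64_t jobCount;
        std::uint64_t wakeups;
        char pad[CACHE_LINE_SIZE - 2 * sizeof(std::uint64_t)];
      };

      static_assert(sizeof(ThreadLocalState) == CACHE_LINE_SIZE,
                    "padding must round the struct up to one cache line");
      static_assert(alignof(ThreadLocalState) == CACHE_LINE_SIZE,
                    "struct must start on a cache line boundary");

      // Stand-in for a global repository of per-thread state.
      struct Repository {
        ThreadLocalState threads[8];
      };

      int main() {
        // Construct the global-style instance at an aligned address and
        // verify the assumption instead of relying on it by accident.
        alignas(CACHE_LINE_SIZE) static Repository g_repository;
        assert(reinterpret_cast<std::uintptr_t>(&g_repository)
                   % CACHE_LINE_SIZE == 0);
        return 0;
      }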

   Bugs Fixed
     * The global checkpoint commit and save protocols can be delayed by
       various causes, including slow disk I/O. The DIH master node
       monitors the progress of both of these protocols, and can enforce a
       maximum lag time during which the protocols are stalled by killing
       the node responsible for the lag when it reaches this maximum. This
       DIH master GCP monitor mechanism did not perform its task more than
       once per master node; that is, it failed to continue monitoring
       after detecting and handling a GCP stop. (Bug #20128256)
       References: See also Bug #19858151.
     * When running mysql_upgrade on a MySQL Cluster SQL node, the
       expected drop of the performance_schema database on this node was
       instead performed on all SQL nodes connected to the cluster. (Bug
       #20032861)
     * A number of problems relating to the fired triggers pool have been
       fixed, including the following issues:
          + When the fired triggers pool was exhausted, NDB returned Error
            218 (Out of LongMessageBuffer). A new error code 221 is added
            to cover this case.
          + An additional, separate case in which Error 218 was wrongly
            reported now returns the correct error.
          + Setting low values for MaxNoOfFiredTriggers led to an error
            when no memory was allocated if there was only one hash
            bucket.
          + An aborted transaction now releases any fired trigger records
            it held. Previously, these records were held until its
            ApiConnectRecord was reused by another transaction.
          + In addition, for the Fired Triggers pool in the internal
            ndbinfo.ndb$pools table, the high value always equalled the
            total, due to the fact that all records were momentarily
            seized when initializing them. Now the high value shows the
            maximum following completion of initialization.
       (Bug #19976428)
     * Online reorganization when using ndbmtd data nodes and with binary
       logging by mysqld enabled could sometimes lead to failures in the
        TRIX and DBLQH kernel blocks, or to silent data corruption. (Bug
       #19903481)
       References: See also Bug #19912988.
     * The local checkpoint scan fragment watchdog and the global
       checkpoint monitor can each exclude a node that is too slow while
       participating in their respective protocols. This exclusion was
       implemented simply by asking the failing node to shut down,
       which, if the shutdown was delayed for any reason, could prolong
       the duration of the GCP or LCP stall for other, unaffected nodes.
       To minimize this time, an isolation mechanism has been added to
       both protocols whereby any other live nodes forcibly disconnect the
       failing node after a predetermined amount of time. This allows the
       failing node the opportunity to shut down gracefully (after logging
       debugging and other information) if possible, but limits the time
       that other nodes must wait for this to occur. Now, once the
       remaining live nodes have processed the disconnection of any
       failing nodes, they can commence failure handling and restart the
       related protocol or protocols, even if the failed node takes an
       excessively long time to shut down. (Bug #19858151)
       References: See also Bug #20128256.
     * A watchdog failure resulted from a hang while freeing a disk page
       in TUP_COMMITREQ, due to use of an uninitialized block variable.
       (Bug #19815044, Bug #74380)
     * Multiple threads crashing led to multiple sets of trace files being
       printed and possibly to deadlocks. (Bug #19724313)
     * When a client retried against a new master a schema transaction
       that failed previously against the previous master while the latter
       was restarting, the lock obtained by this transaction on the new
       master prevented the previous master from progressing past start
       phase 3 until the client was terminated and the resources it held
       were cleaned up. (Bug #19712569, Bug #74154)
     * When using the NDB storage engine, the maximum possible length of a
       database or table name is 63 characters, but this limit was not
       always strictly enforced. This meant that a statement such as
       CREATE DATABASE, DROP DATABASE, or ALTER TABLE RENAME using a
       name having 64 characters could cause the SQL node on which it
       was executed to
       fail. Now such statements fail with an appropriate error message.
       (Bug #19550973)
     * When a new data node started, API nodes were allowed to attempt to
       register themselves with the data node for executing transactions
       before the data node was ready. This forced the API node to wait an
       extra heartbeat interval before trying again.
       To address this issue, a number of HA_ERR_NO_CONNECTION errors
       (Error 4009) that could be issued during this time have been
       changed to Cluster temporarily unavailable errors (Error 4035),
       which should allow API nodes to use new data nodes more quickly
       than before. As part of this fix, some errors which were
       incorrectly categorised have been moved into the correct
       categories, and some errors which are no longer used have been
       removed. (Bug #19524096, Bug #73758)
     * When executing very large pushdown joins involving one or more
       indexes each defined over several columns, it was possible in some
       cases for the DBSPJ block (see The DBSPJ Block) in the NDB kernel
       to generate SCAN_FRAGREQ signals that were excessively large. This
       caused data nodes to fail when these could not be handled
       correctly, due to a hard limit in the kernel on the size of such
       signals (32K). This fix bypasses that limitation by breaking up
       SCAN_FRAGREQ data that is too large for one such signal, and
       sending the SCAN_FRAGREQ as a chunked or fragmented signal
       instead (see the chunking sketch following this list). (Bug
       #19390895)
     * ndb_index_stat sometimes failed when used against a table
       containing unique indexes. (Bug #18715165)
     * Queries against tables containing a CHAR(0) column failed with
       ERROR 1296 (HY000): Got error 4547 'RecordSpecification has
       overlapping offsets' from NDBCLUSTER. (Bug #14798022)
     * In the NDB kernel, it was possible for a TransporterFacade object
       to reset a buffer while the data contained by the buffer was being
       sent, which could lead to a race condition. (Bug #75041, Bug
       #20112981)
     * mysql_upgrade failed to drop and recreate the ndbinfo database and
       its tables as expected. (Bug #74863, Bug #20031425)
     * Due to a lack of memory barriers, MySQL Cluster programs such as
       ndbmtd did not compile on POWER platforms. (Bug #74782, Bug
       #20007248)
     * In some cases, when run against a table having an AFTER DELETE
       trigger, a DELETE statement that matched no rows still caused the
       trigger to execute. (Bug #74751, Bug #19992856)
     * A basic requirement of the NDB storage engine's design is that the
       transporter registry not attempt to receive data
       (TransporterRegistry::performReceive()) from and update the
       connection status (TransporterRegistry::update_connections()) of
       the same set of transporters concurrently, due to the fact that the
       updates perform final cleanup and reinitialization of buffers used
       when receiving data. Changing the contents of these buffers while
       reading or writing to them could lead to "garbage" or inconsistent
       signals being read or written.
       During the course of work done previously to improve the
       implementation of the transporter facade, a mutex intended to
       protect against the concurrent use of the performReceive() and
       update_connections() methods on the same transporter was
       inadvertently removed. This fix adds a watchdog check for
       concurrent usage. In addition, update_connections() and
       performReceive() calls are now serialized together while polling
       the transporters. (Bug #74011, Bug #19661543)
     * ndb_restore failed while restoring a table which contained both a
       built-in conversion on the primary key and a staging conversion on
       a TEXT column.
       During staging, a BLOB table is created with a primary key column
       of the target type. However, a conversion function was not provided
       to convert the primary key values before loading them into the
       staging blob table, which resulted in corrupted primary key values
       in the staging BLOB table. While moving data from the staging table
       to the target table, the BLOB read failed because it could not find
       the primary key in the BLOB table.
       Now all BLOB tables are checked to see whether there are
       conversions on primary keys of their main tables. This check is
       done after all the main tables are processed, so that conversion
       functions and parameters have already been set for the main tables.
       Any conversion functions and parameters used for the primary key in
       the main table are now duplicated in the BLOB table. (Bug #73966,
       Bug #19642978)
     * Corrupted messages to data nodes sometimes went undetected, causing
       a bad signal to be delivered to a block which aborted the data
       node. This failure in combination with disconnecting nodes could in
       turn cause the entire cluster to shut down.
       To keep this from happening, additional checks are now made when
       unpacking signals received over TCP, including checks for byte
       order, compression flag (which must not be used), and the length of
       the next message in the receive buffer (if there is one).
       Whenever two consecutive unpacked messages fail the checks just
       described, the current message is assumed to be corrupted. In this
       case, the transporter is marked as having bad data and no more
       unpacking of messages occurs until the transporter is reconnected.
       In addition, an entry is written to the cluster log containing the
       error as well as a hex dump of the corrupted message. (Bug #73843,
       Bug #19582925)
     * Transporter send buffers were not updated properly following a
       failed send. (Bug #45043, Bug #20113145)
     * ndb_restore --print_data truncated TEXT and BLOB column values to
       240 bytes rather than 256 bytes.
     * Disk Data: An update on many rows of a large Disk Data table could
       in some rare cases lead to node failure. In the event that such
       problems are observed with very large transactions on Disk Data
       tables you can now increase the number of page entries allocated
       for disk page buffer memory by raising the value of the
       DiskPageBufferEntries data node configuration parameter added in
       this release. (Bug #19958804)
     * Disk Data: When a node acting as a DICT master fails, the
       arbitrator selects another node to take over in place of the failed
       node. During the takeover procedure, which includes cleaning up any
       schema transactions which are still open when the master failed,
       the disposition of the uncommitted schema transaction is decided.
       Normally this transaction would be rolled back, but if it has
       completed a
       sufficient portion of a commit request, the new master finishes
       processing the commit. Until the fate of the transaction has been
       decided, no new TRANS_END_REQ messages from clients can be
       processed. In addition, since multiple concurrent schema
       transactions are not supported, takeover cleanup must be completed
       before any new transactions can be started.
       A similar restriction applies to any schema operations which are
       performed in the scope of an open schema transaction. The counter
       used to coordinate schema operations across all nodes is employed
       both during takeover processing and when executing any non-local
       schema operations. This means that starting a schema operation
       while its schema transaction is in the takeover phase causes this
       counter to be overwritten by concurrent uses, with unpredictable
       results.
       The scenarios just described were previously handled using a
       pseudo-random delay when recovering from a node failure. Now a
       check is made as to whether the new master has rolled forward or
       back any schema transactions remaining after the failure of the
       previous master, and starting new schema transactions or
       performing operations using old transactions is avoided until
       takeover processing has cleaned up after the abandoned
       transaction. (Bug #19874809, Bug #74503)
     * Disk Data: When a node acting as DICT master fails, it is still
       possible to request that any open schema transaction be either
       committed or aborted by sending this request to the new DICT
       master. In this event, the new master takes over the schema
       transaction and reports back on whether the commit or abort request
       succeeded. In certain cases, it was possible for the new master to
       be misidentified--that is, the request was sent to the wrong node,
       which responded with an error that was interpreted by the client
       application as an aborted schema transaction, even in cases where
       the transaction could have been successfully committed, had the
       correct node been contacted. (Bug #74521, Bug #19880747)
     * Cluster Replication: When an NDB client thread made a request to
       flush the binary log using statements such as FLUSH BINARY LOGS or
       SHOW BINLOG EVENTS, this caused not only the most recent changes
       made by this client to be flushed, but all recent changes made by
       all other clients to be flushed as well, even though this was not
       needed. This behavior caused unnecessary waiting for the statement
       to execute, which could lead to timeouts and other issues with
       replication. Now such statements flush the most recent database
       changes made by the requesting thread only.
       As part of this fix, the status variables
       Ndb_last_commit_epoch_server, Ndb_last_commit_epoch_session, and
       Ndb_slave_max_replicated_epoch, originally implemented in MySQL
       Cluster NDB 7.4, are also now available in MySQL Cluster NDB 7.3.
       For descriptions of these variables, see MySQL Cluster Status
       Variables; for further information, see MySQL Cluster Replication
       Conflict Resolution. (Bug #19793475)
     * Cluster Replication: It was possible using wildcards to set up
       conflict resolution for an exceptions table (that is, a table named
       using the suffix $EX), which should not be allowed. Now when a
       replication conflict function is defined using wildcard
       expressions, these are checked for possible matches so that, in the
       event that the function would cover an exceptions table, it is not
       set up for this table. (Bug #19267720)
     * Cluster API: It was possible to delete an Ndb_cluster_connection
       object while there remained instances of Ndb using references to
       it. Now the Ndb_cluster_connection destructor waits for all
       related Ndb objects to be released before completing (see the
       teardown sketch following this list). (Bug #19999242)
       References: See also Bug #19846392.
     * Cluster API: The buffer allocated by an NdbScanOperation for
       receiving scanned rows was not released until the NdbTransaction
       owning the scan operation was closed. This could lead to excessive
       memory usage in an application where multiple scans were created
       within the same transaction, even if these scans were closed at the
       end of their lifecycle, unless NdbScanOperation::close() was
       invoked with the releaseOp argument equal to true. Now the
       buffer is released whenever the cursor navigating the result set
       is closed with NdbScanOperation::close(), regardless of the value
       of this argument (see the scan sketch following this list). (Bug
       #75128, Bug #20166585)
     * ClusterJ: The following errors were logged at the SEVERE level;
       they are now logged at the NORMAL level, as they should be:
          + Duplicate primary key
          + Duplicate unique key
          + Foreign key constraint error: key does not exist
          + Foreign key constraint error: key exists
       (Bug #20045455)
     * ClusterJ: The com.mysql.clusterj.tie class gave off a logging
       message at the INFO logging level for every single query, which was
       unnecessary and was affecting the performance of applications that
       used ClusterJ. (Bug #20017292)
     * ClusterJ: ClusterJ reported a segmentation violation when an
       application closed a session factory while some sessions were still
       active. This was because MySQL Cluster allowed an
        Ndb_cluster_connection object to be deleted while some Ndb
       instances were still active, which might result in the usage of
       null pointers by ClusterJ. This fix stops that happening by
       preventing ClusterJ from closing a session factory when any of its
       sessions are still active. (Bug #19846392)
       References: See also Bug #19999242.
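
   The chunked-signal fix for Bug #19390895 in the list above amounts to
   a standard technique: a payload that would exceed a hard per-message
   limit is split into fragments that each fit under it. The chunking
   sketch below is a generic illustration under that assumption, not the
   NDB kernel's implementation; the 32K constant mirrors the limit the
   note mentions, and send_chunked() with its callback is an invented
   interface.

      #include <cstddef>
      #include <functional>
      #include <vector>

      static constexpr std::size_t MAX_SIGNAL_BYTES = 32 * 1024;  // hard cap

      // Split an oversized payload into fragments of at most
      // MAX_SIGNAL_BYTES, handing each to a caller-supplied sender and
      // flagging the final fragment so the receiver can reassemble.
      void send_chunked(const std::vector<unsigned char> &payload,
                        const std::function<void(const unsigned char *data,
                                                 std::size_t len,
                                                 bool last)> &send_fragment)
      {
        std::size_t offset = 0;
        do {
          const std::size_t remaining = payload.size() - offset;
          const std::size_t len =
              remaining < MAX_SIGNAL_BYTES ? remaining : MAX_SIGNAL_BYTES;
          send_fragment(payload.data() + offset, len,
                        offset + len == payload.size());
          offset += len;
        } while (offset < payload.size());
      }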
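
   The teardown sketch below goes with the Cluster API item for Bug
   #19999242: the fixed Ndb_cluster_connection destructor now waits for
   related Ndb objects, but releasing every Ndb object before the
   connection keeps the teardown order explicit anyway. The connect
   string and database name are placeholders and error handling is
   minimal; this is an assumed usage pattern, not code taken from the
   release.

      #include <NdbApi.hpp>
      #include <cstdlib>

      int main() {
        ndb_init();                                      // NDB API start-up
        {
          Ndb_cluster_connection conn("localhost:1186"); // placeholder address
          if (conn.connect(4 /*retries*/, 5 /*delay s*/, 1 /*verbose*/) != 0 ||
              conn.wait_until_ready(30, 0) < 0) {
            ndb_end(0);
            return EXIT_FAILURE;
          }

          Ndb ndb(&conn, "test");                        // placeholder database
          if (ndb.init() != 0) {
            ndb_end(0);
            return EXIT_FAILURE;
          }

          // ... transactions against the cluster would go here ...

          // `ndb` is destroyed before `conn` when this scope ends, so the
          // connection never outlives the Ndb objects that reference it.
        }
        ndb_end(0);
        return EXIT_SUCCESS;
      }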
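
   The scan sketch below goes with the Cluster API item for Bug #75128 /
   Bug #20166585: a scan is closed as soon as its result set has been
   consumed, so its receive buffer need not live as long as the
   enclosing transaction. Table and column names (t1, c1) are
   placeholders and error handling is abbreviated; this is an assumed
   usage pattern, not code taken from the release.

      #include <NdbApi.hpp>

      // Scan placeholder table t1, returning the number of rows seen or
      // -1 on error.
      static int scan_once(Ndb &ndb) {
        NdbTransaction *trans = ndb.startTransaction();
        if (trans == nullptr)
          return -1;

        NdbScanOperation *scan = trans->getNdbScanOperation("t1");
        if (scan == nullptr ||
            scan->readTuples(NdbOperation::LM_CommittedRead) != 0 ||
            scan->getValue("c1") == nullptr ||
            trans->execute(NdbTransaction::NoCommit) != 0) {
          ndb.closeTransaction(trans);
          return -1;
        }

        int rows = 0;
        while (scan->nextResult(true) == 0)
          rows++;

        // With the fix, close() releases the scan's row buffer regardless
        // of the releaseOp argument; on earlier releases, calling
        // scan->close(false, true) was needed for the same effect inside a
        // long-lived transaction.
        scan->close();

        ndb.closeTransaction(trans);
        return rows;
      }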


To generate a diff of this commit:
cvs rdiff -u -r1.2 -r1.3 pkgsrc/databases/mysql-cluster/Makefile.common \
    pkgsrc/databases/mysql-cluster/PLIST \
    pkgsrc/databases/mysql-cluster/distinfo

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.



