08 Apr, 2014

1 commit

  • Decouple writing of status messages (change time heartbeats, monitor, and topology msgs) from RS reader threads through the use of a simple event service. It is now the responsibility of the StatusAnalyzer thread to send status messages when notified to do so by the ReplicationServerDomain. In addition, the Monitor*Msgs are no longer routable since they were only ever sent directly between peers. This simplifies some of the request processing in ReplicationServerDomain.
    
    This change does not attempt to solve potential deadlocks arising from transmission of assured replication acks, status changes, generation ID updates, windowing messages (which are deprecated), total update messages, and error messages.
    
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10634 41b1ffd8-f28e-4786-ab96-9950f0a78031
    matthew
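
The decoupling described in this commit can be pictured roughly as follows. This is only a sketch under assumptions: the commit does not show the event service's API, so `StatusEvent`, `PendingStatusEvents`, `notifyPending()` and `drain()` are hypothetical names, and coalescing repeated notifications into one is a design assumption made here for illustration. The reader thread only records that a message is due; a separate sender thread (the StatusAnalyzer in the commit) later picks up the pending events and does the actual sending.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical event types, one per kind of status message named in the commit. */
enum StatusEvent { CHANGE_TIME_HEARTBEAT, MONITOR, TOPOLOGY }

/** Hypothetical event service: readers record requests, a sender thread drains them. */
final class PendingStatusEvents {
  private final EnumSet<StatusEvent> pending = EnumSet.noneOf(StatusEvent.class);

  /** Called by reader threads: records the request and returns without doing any I/O. */
  synchronized void notifyPending(StatusEvent event) {
    pending.add(event); // repeated notifications of the same type coalesce
  }

  /** Called by the sender thread: returns a copy of the pending events and clears them. */
  synchronized Set<StatusEvent> drain() {
    Set<StatusEvent> copy = EnumSet.copyOf(pending);
    pending.clear();
    return copy;
  }
}
```

In this sketch the reader thread never blocks on a network write, which is the property the commit is after.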
     

03 Apr, 2014

1 commit

  • JEChangelogDB.java:
    In ChangelogDBPurger.run(), changed code to:
    - support the absence of the change number index DB. To simplify matters, the code assumes that ds-cfg-compute-change-number does not change during the lifetime of an RS.
    - purge by using a CSN made up from the purge delay rather than using the previous cookie.
    - sleep 500 millis if there are no changes to purge, or sleep till the next change to purge.
    - shut down gracefully without fuss in the logs.
    
    
    JEChangeNumberIndexDB.java
    In purgeUpTo(), return the oldest non purged CSN rather than the previous cookie + merged two branches of the code.
    
    ExternalChangeLogTest.java:
    Consequence of the change to JEChangeNumberIndexDB.
    Extracted method assertECLLimits() from ECLCompatTestLimits() + added a loop inside it to let the code persist changes asynchronously from the test thread.
    
    JEChangeNumberIndexDBTest.java:
    Renamed testTrim() to testPurge().
    In newReplicationServer(), enabled ds-cfg-compute-change-number.
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10617 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
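
The purge timing described for ChangelogDBPurger.run() can be sketched as below. This is an illustration only, not the committed code: `PurgeScheduleSketch` and its methods are hypothetical, timestamps are plain milliseconds rather than CSNs, and "sleep till the next change to purge" is interpreted here as "sleep until the oldest remaining change becomes older than the purge delay".

```java
/** Hypothetical helper showing how a purger might pick its cut-off and sleep time. */
final class PurgeScheduleSketch {
  /** Sleep time used when there is nothing left to purge (500 ms, as in the commit). */
  static final long DEFAULT_SLEEP_MS = 500;

  /** Cut-off timestamp: changes strictly older than this are eligible for purging. */
  static long purgeTimestamp(long nowMs, long purgeDelayMs) {
    return nowMs - purgeDelayMs;
  }

  /**
   * @param oldestRemainingMs timestamp of the oldest change still stored, or -1 if none
   * @return how long to sleep before the next purge attempt
   */
  static long computeSleepMs(long nowMs, long purgeDelayMs, long oldestRemainingMs) {
    if (oldestRemainingMs < 0) {
      return DEFAULT_SLEEP_MS; // no changes at all: poll again later
    }
    long becomesPurgeableAt = oldestRemainingMs + purgeDelayMs;
    return Math.max(0, becomesPurgeableAt - nowMs); // never a negative sleep
  }
}
```

For example, with a purge delay of 3000 ms, a change stamped at 8000 becomes purgeable at 11000, so at time 10000 the sketch sleeps for 1000 ms.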
     

02 Apr, 2014

1 commit

  • 
    After OPENDJ-1174, change number index DB is now populated eagerly (was populated lazily).
    This means we can reap the benefits by changing how purging is done for changelogDB.
    The previous purge process was driven by the replicaDBs: it first purged the replicaDBs, then purged the change number index DB. This caused many problems, such as stale CNIndexDB records.
    The new purge process is driven by the change number index DB: it first purges the change number index DB, then purges the replicaDBs based on the oldest valid referenced record in the change number index DB.
    
    Moved JEChangeNumberIndexDB purge thread to JEChangelogDB + made it responsible for purging the ChangeNumberIndexDB and all the ReplicaDBs.
    In JEReplicaDB, the thread is now only responsible for flushing. Previously it was responsible for both trimming and flushing, which complicated the design and made both operations less efficient.
    
    
    
    JEChangelogDB.java:
    Added inner class ChangelogDBPurger.
    Added fields purgeDelay, cnPurger and latestTrimDate.
    Extracted method startIndexer()
    Removed getChangeNumberIndexDB(boolean) + associated code.
    Reimplemented getDomainLatestTrimDate() and setPurgeDelay().
    
    JEChangeNumberIndexDB.java:
    No longer implements Runnable + removed run().
    Removed fields trimmingThread, trimAge and replicationServer.
    In ctor, removed ReplicationServer parameter.
    Removed startTrimmingThread(), setPurgeDelay() and clear(DN, AtomicBoolean).
    Renamed clear(DN) to removeDomain(DN).
    Renamed trim(AtomicBoolean) to purgeUpTo(long).
    Added purgeUpToCookie(ChangeNumberIndexRecord).
    
    JEReplicaDB.java:
    Removed fields latestTrimDate and trimAge.
    In run(), no longer call trim().
    Removed getLatestTrimDate(), isQueueAboveLowMark(), setPurgeDelay() and getQueueSize().
    Renamed trim() to purgeUpTo(CSN).
    Made flush() private + reduced wait time on polling the msgQueue to speed up shutdown.
    Added getNumberRecords() for unit tests.
    
    JEReplicaDBCursor.java:
    Since flushing is now eager, removed all calls to JEReplicaDB.flush().
    Extracted method closeCursor().
    
    ReplicationServer.java:
    Renamed getTrimAge() to getPurgeDelay() for consistency.
    
    ReplicationDB.java:
    Added getNumberRecords().
    
    ChangeNumberIndexer.java, ChangeNumberIndexerTest.java:
    Removed hacks due to old purging code.
    
    ExternalChangeLogTest.java:
    Called ChangelogDB.setPurgeDelay() instead of ChangeNumberIndexDB.setPurgeDelay(0).
    Consequence of the changes to JEReplicaDB.
    Removed method setPurgeDelayToInitialValue() that was not doing anything (JEChangeNumberIndexDB was always null).
    In getCNIndexDB(), removed useless null check.
    
    JEChangeNumberIndexDBTest.java:
    Removed constants value1, value2, value3 which are invalid cookies.
    Replaced them with fields previousCookie and cookies.
    Added clearCookie() method.
    In addRecord(), removed cookie parameter and build the cookie from the new fields.
    Consequence of the changes to JEChangeNumberIndexDB.
    
    JEReplicaDBTest.java:
    In waitChangesArePersisted(), used JEReplicaDB.getNumberRecords() instead of JEReplicaDB.getQueueSize() + added parameters describing the number of expected records + the counter record window.
    Replaced all calls to JEReplicaDB.flush() with calls to waitChangesArePersisted().
    Consequence of the changes to JEReplicaDB.
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10614 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
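
The index-driven purge order described in this commit can be sketched as follows. The sketch makes several assumptions that are not in the commit: CSNs are modelled as plain `long` values that grow with the change number, a single `TreeSet` stands in for all the replica DBs, and `ChangelogPurgeSketch` with its fields and methods is hypothetical. The point it illustrates is that the replica DB is only purged up to the oldest CSN still referenced by the change number index, so the index never points at a missing record.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical in-memory model of a change number index plus one replica DB. */
final class ChangelogPurgeSketch {
  /** Change number -> CSN, ordered by change number. */
  final TreeMap<Long, Long> cnIndex = new TreeMap<Long, Long>();
  /** CSNs stored in the replica DB. */
  final TreeSet<Long> replicaDB = new TreeSet<Long>();

  /**
   * Removes index records whose CSN is older than purgeCsn.
   * Returns the oldest CSN still referenced, or purgeCsn if the index is now empty.
   */
  long purgeIndexUpTo(long purgeCsn) {
    Iterator<Map.Entry<Long, Long>> it = cnIndex.entrySet().iterator();
    while (it.hasNext()) {
      long csn = it.next().getValue();
      if (csn >= purgeCsn) {
        return csn; // first record that must be kept
      }
      it.remove();
    }
    return purgeCsn;
  }

  /** Step 1: purge the index. Step 2: purge the replica DB up to what the index still references. */
  void purge(long purgeCsn) {
    long oldestReferenced = purgeIndexUpTo(purgeCsn);
    replicaDB.headSet(oldestReferenced, false).clear();
  }
}
```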
     

01 Apr, 2014

1 commit


31 Mar, 2014

2 commits

  • Code review: Matthew Swift
    
    Code fix after r10601.
    Several problems shown by continuous integration:
    * Deadlocks,
    * Long waits
    
    JEReplicaDB.java:
    In clear(), calling collectAllPermits() while holding the flushLock actually prevented flushing from happening, because flushing also needs the flushLock. Calling collectAllPermits() does not require holding the flushLock, and once this method returns, the msgQueue should be empty anyway and all the changes should have been pushed to the DB (flush() first removes messages from msgQueue, then adds them to the DB, then releases all permits). In effect, there is no need to synchronize on flushLock anymore.
    In trim(), the check for the queue being below the low mark was wrong (it compared the queue size with a number of bytes): extracted isQueueAboveLowMark() and fixed its definition + removed synchronized (flushLock) because it is now deemed unnecessary.
    In generateCursorFrom(), removed the unnecessary call to flush() because that method is also called from the JEReplicaDBCursor ctor.
    
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10608 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
     
  • …formatted shell commands; OPENDJ-1376: Add <userinput> and potential <computeroutput> to <screen> content
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10607 41b1ffd8-f28e-4786-ab96-9950f0a78031
    mark
     

28 Mar, 2014

2 commits

  • Fixed checkstyle failure.
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10603 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
     
  • In JEReplicaDB, simplified the logic that handled the internal queue that is used before actually persisting UpdateMsg changes to the underlying Berkeley JE DB.
    Simplified the publisher/consumer model (msgQueue.add() / msgQueue.remove()) by relying on a LinkedBlockingQueue and a semaphore, instead of many synchronized blocks and fields that cluttered this code.
    
    
    
    JEReplicaDB.java:
    Changed msgQueue from LinkedList to LinkedBlockingQueue.
    Removed fields queueMaxSize, queueLowmark, queueHimark, queueLowmarkBytes, queueHimarkBytes, queueByteSize and replaced them all with queueSizeBytes Semaphore.
    Removed clearQueue() and getChanges().
    Added collectAllPermits().
    Added immutable CSNLimits class to remove the need for synchronizing on oldest and newest CSNs.
    
    ReplicationDB.java:
    In addEntries(), now return the total size of the persisted messages (return type was void).
    
    JEReplicaDBTest.java:
    In testTrim(), allowed the test to finish + made the code clearer.
    
    replication.properties:
    Added an error message for adding a change to the JEReplicaDB.
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10601 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
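
The queue design described in this commit can be sketched as below. The names msgQueue, queueSizeBytes, flush() and collectAllPermits() come from the commit message; everything else is an assumption made for illustration: the class `ByteBoundedQueue`, the `byte[]` payload standing in for UpdateMsg, the in-memory list standing in for the Berkeley JE DB, and the timeout parameter on add(). The semaphore holds one permit per byte of free capacity, so producers block when the queue is full and the flusher hands capacity back after persisting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical queue bounded by total message size rather than message count. */
final class ByteBoundedQueue {
  private final int maxBytes;
  private final LinkedBlockingQueue<byte[]> msgQueue = new LinkedBlockingQueue<byte[]>();
  /** One permit per byte of free capacity. */
  private final Semaphore queueSizeBytes;
  /** Stand-in for the underlying DB. */
  private final List<byte[]> persisted = new ArrayList<byte[]>();

  ByteBoundedQueue(int maxBytes) {
    this.maxBytes = maxBytes;
    this.queueSizeBytes = new Semaphore(maxBytes);
  }

  /** Producer side: waits up to timeoutMs for enough free bytes, then enqueues. */
  boolean add(byte[] msg, long timeoutMs) {
    try {
      if (!queueSizeBytes.tryAcquire(msg.length, timeoutMs, TimeUnit.MILLISECONDS)) {
        return false; // queue stayed full for the whole timeout
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    msgQueue.add(msg);
    return true;
  }

  /** Consumer side: removes messages, persists them, then releases their permits. */
  synchronized int flush() {
    List<byte[]> batch = new ArrayList<byte[]>();
    msgQueue.drainTo(batch);
    int bytes = 0;
    for (byte[] msg : batch) {
      persisted.add(msg);
      bytes += msg.length;
    }
    queueSizeBytes.release(bytes);
    return bytes;
  }

  /** Blocks until every permit is free again, i.e. all queued messages were flushed. */
  void collectAllPermits() {
    queueSizeBytes.acquireUninterruptibly(maxBytes);
    queueSizeBytes.release(maxBytes);
  }

  synchronized int persistedCount() {
    return persisted.size();
  }
}
```

Note that collectAllPermits() in this sketch takes no lock of its own; it only waits on the semaphore, so it cannot prevent flush() from running.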
     

27 Mar, 2014

1 commit


26 Mar, 2014

1 commit


25 Mar, 2014

1 commit


21 Mar, 2014

1 commit


14 Mar, 2014

2 commits


10 Mar, 2014

1 commit


06 Mar, 2014

1 commit


05 Mar, 2014

1 commit


04 Mar, 2014

1 commit


24 Feb, 2014

1 commit


20 Feb, 2014

2 commits


19 Feb, 2014

2 commits

  • 
    Improved design for Replication Topology
    
    
    ReplicationBroker.java + *Test.java:
    Extracted the Topology class to encapsulate the dsList and replicationServerInfos fields + atomically set it via an AtomicReference + moved setLocallyConfiguredFlag(), isSameReplicationServerUrl(), computeConnectedDSs() to Topology class.
    Created methods computeNewTopology() and topologyChange() to compute and set the new topology.
    Removed generationID instance variable duplicated with the one from ReplicationDomain + updated ctor and setGenerationID().
    Improved debugging messages.
    Renamed getDsList() to getReplicaInfos() + changed return type from List<DSInfo> to Map<Integer, DSInfo>.
    Renamed getRsList() to getRsInfos().
    Extracted method toRSInfos().
    
    ReplicationBrokerTest.java: ADDED
    Added to test new ReplicationBroker.Topology class.
    
    
    DSInfo.java:
    Changed equals(Set<String>, Set<String>) to equals(Object, Object).
    In toString(), hid the assured fields if assured replication is off.
    
    RSInfo.java:
    Renamed fields id and serverUrl to rsServerId and rsServerURL.
    In toString(), relied on the compiler to generate the String.
    
    
    TopologyMsg.java:
    Renamed fields dsList and rsList to replicaInfos and rsInfos.
    Renamed getDsList() to getReplicaInfos() + changed return type from List<DSInfo> to Map<Integer, DSInfo>.
    Renamed getRsList() to getRsInfos().
    Extracted methods readStrings() and writeStrings().
    Code cleanup
    Used javadocs
    
    
    ReplicationDomain.java, replication*.properties:
    Extracted class ECLIncludes to replace eclIncludesLock, eclIncludesByServer, eclIncludesAllServers, eclIncludesForDeletesByServer, eclIncludesForDeletesAllServers + moved setEclIncludes() implementation to this class + removed synchronized blocks from all the getters.
    Added baseDN to ERR_INIT_NO_SUCCESS_START_FROM_SERVERS.
    Inlined initializeRemote().
    Removed unused initializeFromRemote().
    Reduced methods visibility.
    Renamed getReplicaList() to getReplicaInfos() + changed return type from List<DSInfo> to Map<Integer, DSInfo>.
    Renamed getRsList() to getRsInfos().
    In initializeRemote(), waitForRemoteEndOfInit(), isRemoteDSConnected() and getProtocolVersion(), simplified code.
    
    *.java:
    Consequence of the changes above.
    Simplified code in a number of places, particularly in the tests.
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10405 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
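
The idea of publishing the topology atomically can be sketched as follows. The commit names the Topology class, the AtomicReference and topologyChange(); the rest is assumed for illustration: `TopologySnapshot` and `BrokerSketch` are hypothetical, and plain strings stand in for DSInfo and RSInfo. Because each snapshot is immutable and swapped in with a single reference write, a reader always sees a consistent pair of maps without any synchronized block.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical immutable view of the topology: both maps are set together. */
final class TopologySnapshot {
  final Map<Integer, String> replicaInfos;
  final Map<Integer, String> rsInfos;

  TopologySnapshot(Map<Integer, String> replicaInfos, Map<Integer, String> rsInfos) {
    // Defensive copies so later changes by the caller cannot leak into the snapshot.
    this.replicaInfos = Collections.unmodifiableMap(new HashMap<Integer, String>(replicaInfos));
    this.rsInfos = Collections.unmodifiableMap(new HashMap<Integer, String>(rsInfos));
  }
}

/** Hypothetical broker holding the current snapshot in an AtomicReference. */
final class BrokerSketch {
  private final AtomicReference<TopologySnapshot> topology =
      new AtomicReference<TopologySnapshot>(new TopologySnapshot(
          Collections.<Integer, String>emptyMap(), Collections.<Integer, String>emptyMap()));

  TopologySnapshot getTopology() {
    return topology.get();
  }

  /** Builds a new snapshot and publishes it in one atomic write. */
  void topologyChange(Map<Integer, String> newReplicaInfos, Map<Integer, String> newRsInfos) {
    topology.set(new TopologySnapshot(newReplicaInfos, newRsInfos));
  }
}
```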
     
  • git-svn-id: https://svn.forgerock.org/opendj/trunk@10402 41b1ffd8-f28e-4786-ab96-9950f0a78031
    cjr
     

12 Feb, 2014

1 commit


11 Feb, 2014

1 commit


10 Feb, 2014

3 commits


07 Feb, 2014

1 commit


05 Feb, 2014

1 commit


04 Feb, 2014

1 commit


03 Feb, 2014

2 commits


31 Jan, 2014

2 commits


30 Jan, 2014

1 commit


28 Jan, 2014

3 commits

  • 
    replication*.properties:
    Added exception stacktraces to NOTICE_READER_EXCEPTION_53.
    
    ServerReader.java:
    In run(), logged the stacktrace when calling logError(). Removed redundant call to logException() which logged to debug logger.
    
    
    Session.java:
    Extracted method read().
    
    
    StaticUtils.java:
    In stackTraceToSingleLineString(), when this is not a debug build, added the exception type at the start of the message.
    
    StaticUtilsTest.java: ADDED
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10202 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
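
The effect of putting the exception type at the start of a single-line stack trace can be illustrated with the sketch below. It is not the StaticUtils implementation: `StackTraceSketch.toSingleLine()` is a hypothetical helper, and the output layout and the limit of three frames are assumptions chosen only to keep the example short.

```java
/** Hypothetical helper producing a one-line summary of an exception. */
final class StackTraceSketch {
  static String toSingleLine(Throwable t) {
    StringBuilder sb = new StringBuilder();
    // Exception type first, so the log line says what went wrong even without a message.
    sb.append(t.getClass().getSimpleName());
    if (t.getMessage() != null) {
      sb.append(": ").append(t.getMessage());
    }
    sb.append(" (");
    StackTraceElement[] frames = t.getStackTrace();
    for (int i = 0; i < frames.length && i < 3; i++) {
      if (i > 0) {
        sb.append(' ');
      }
      sb.append(frames[i].getFileName()).append(':').append(frames[i].getLineNumber());
    }
    sb.append(')');
    return sb.toString();
  }
}
```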
     
  • Code review: Matthew Swift
    
    Caused by r10049.
    Problem is down to the initialization sequence:
    1. thread 1 - MultimasterReplication.initializeSynchronizationProvider()
    1.1. it creates the ReplicationServerListener
    1.1.1. the ReplicationServerListener in turn creates the ReplicationServer
    1.1.1.1. the ReplicationServer in turn creates the ChangelogDB
    1.1.1.1.1. the ChangelogDB in turn creates the ChangeNumberIndexer thread and STARTs it
    1.2. it proceeds with creating the LDAPReplicationDomain objects one by one
    2. thread 2 - ChangeNumberIndexer.run()
    2.1. it calls ChangeNumberIndexer.initialize()
    2.1.1. ChangeNumberIndexer.initialize() calls MultimasterReplication.isECLEnabledDomain(baseDN)
    
    Steps 1.2. and 2.1.1. are running concurrently.
    If 2.1.1. is run before 1.2. is completed, then in ChangeNumberIndexer.initialize():
    1) MultimasterReplication.isECLEnabledDomain(baseDN) returns false, hence a cursor to the relevant replica DBs is not created
    2) then the call to nextChangeForInsertDBCursor.getRecord() returns null, later throwing a NullPointerException because the ChangeNumberIndexer thread is in an illegal state: it was expecting to find an UpdateMsg with the correct CSN stamped on it.
    
    
    
    MultimasterReplication.java:
    Added State enum + state instance member to tell whether MultimasterReplication is ready for work.
    Removed isRegistered instance member superseded by state instance member.
    In isECLEnabledDomain(), completeSynchronizationProvider() and finalizeSynchronizationProvider(), deal with thread waits.
    
    DomainFakeCfg.java:
    Implemented toString().
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10200 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
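
The fix described in this commit, making callers wait until initialization has completed, can be sketched as below. The State enum and isECLEnabledDomain() are named in the commit; the enum constants, the `ProviderSketch` class, the lock object and the string-based domain set are assumptions for illustration only. A thread that asks too early blocks until the state leaves STARTING instead of getting a premature `false`.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical provider whose readiness is tracked by a State enum. */
final class ProviderSketch {
  enum State { STARTING, RUNNING, STOPPING }

  private final Object lock = new Object();
  private State state = State.STARTING; // guarded by lock
  private final Set<String> eclDomains = ConcurrentHashMap.newKeySet();

  void registerDomain(String baseDN) {
    eclDomains.add(baseDN);
  }

  /** Called once all domains have been registered. */
  void completeInitialization() {
    setState(State.RUNNING);
  }

  void shutdown() {
    setState(State.STOPPING);
  }

  private void setState(State newState) {
    synchronized (lock) {
      state = newState;
      lock.notifyAll(); // wake up every thread waiting in isECLEnabledDomain()
    }
  }

  /** Waits while the provider is still starting, then answers from the registered domains. */
  boolean isECLEnabledDomain(String baseDN) {
    synchronized (lock) {
      while (state == State.STARTING) {
        try {
          lock.wait();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return false;
        }
      }
      if (state != State.RUNNING) {
        return false; // shutting down: nothing is enabled any more
      }
    }
    return eclDomains.contains(baseDN);
  }
}
```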
     
  • git-svn-id: https://svn.forgerock.org/opendj/trunk@10196 41b1ffd8-f28e-4786-ab96-9950f0a78031
    ludo
     

25 Jan, 2014

1 commit