24 Apr, 2014

1 commit


23 Apr, 2014

2 commits

  • Preliminary work, also needed by Nicolas for OPENDJ-1177.
    
    
    ByteSequenceReader.java:
    Added peek() and peek(int offset).
    
    ByteSequenceReaderTest.java:
    Added a test.
    Added method b() to ease reading in int -> byte conversion.
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10670 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
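
For readers unfamiliar with the idea, here is a minimal sketch of what peek() and peek(int offset) could look like on a sequential byte reader. PeekableByteReader is a hypothetical stand-in, not the OpenDJ ByteSequenceReader, and it assumes that the offset is relative to the current position; b() mirrors the int -> byte helper mentioned for the test.

```java
/** Hypothetical stand-in for a sequential byte reader; not the OpenDJ class. */
final class PeekableByteReader {
    private final byte[] bytes;
    private int pos;

    PeekableByteReader(byte... bytes) {
        this.bytes = bytes.clone();
    }

    /** Returns the byte at the current position and advances by one. */
    byte get() {
        byte result = peek();
        pos++;
        return result;
    }

    /** Returns the byte at the current position without advancing. */
    byte peek() {
        return peek(0);
    }

    /** Returns the byte located offset bytes after the current position, without advancing. */
    byte peek(int offset) {
        int index = pos + offset;
        if (offset < 0 || index >= bytes.length) {
            throw new IndexOutOfBoundsException("offset " + offset + " is outside the remaining bytes");
        }
        return bytes[index];
    }

    int remaining() {
        return bytes.length - pos;
    }

    /** Narrowing helper so that test data can be written as int literals. */
    static byte b(int i) {
        return (byte) i;
    }
}
```

Peeking leaves the position untouched, so a caller can look ahead before deciding how to decode the following bytes.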
     
  • It seems unnecessary to have a msgQueue on top of JE since JE already has a built-in cache that handles the same responsibility.
    This improvement removes the JEReplicaDB.msgQueue and the associated flushing thread to save on memory and resources.
    
    Code cleanup in JEChangeNumberIndexDB after CR-3388.
    
    
    JEReplicaDB.java:
    Does not implement Runnable anymore.
    Removed fields msgQueue, queueSizeBytes, queueMaxBytes, thread, flushLock.
    Added and used shutdown field to compensate for removing the thread field.
    Removed methods collectAllPermits(), flush(), run() and stop().
    In shutdown(), used AtomicBoolean.compareAndSet().
    
    ReplicationDB.java:
    Renamed addEntries(List<UpdateMsg>) to addEntry(UpdateMsg).
    
    JEReplicaDBTest.java:
    Removed now unnecessary waitChangesArePersisted().
    
    
    JEChangelogDB.java:
    In shutdown(), enforced joining of the threads + called Thread.interrupt() to ensure shutdown. This prevents messages about unclosed cursors in integrated unit tests.
    
    ReplicationServer.java:
    Removed getQueueSize().
    
    
    JEChangeNumberIndexDB.java:
    Removed unused oldestChangeNumber.
    In shutdown(), used AtomicBoolean.compareAndSet() + removed useless call to notify().
    
    JEChangeNumberIndexDBTest.java:
    Fixed javadocs.
    
    
    ChangeTimeHeartbeatMsg.java:
    Implemented toString().
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10669 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
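
The shutdown() changes above rely on AtomicBoolean.compareAndSet(). A minimal sketch of that idiom follows; the class name and the release counter are hypothetical and only stand in for whatever resources the real classes close.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical component whose shutdown() must release its resources only once. */
final class ShutdownOnce {
    private final AtomicBoolean shutdown = new AtomicBoolean(false);
    private final AtomicInteger releases = new AtomicInteger();

    void shutdown() {
        // Only the caller that flips the flag from false to true performs the release.
        if (shutdown.compareAndSet(false, true)) {
            releases.incrementAndGet(); // stands in for closing the underlying DB
        }
    }

    boolean isShutdown() {
        return shutdown.get();
    }

    int releaseCount() {
        return releases.get();
    }
}
```

Calling shutdown() several times, possibly from several threads, releases the resources a single time without needing a lock.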
     

18 Apr, 2014

1 commit

  • The bug was due to a very complex interaction between various components. Here is a scenario and explanation:
    1- the change number indexer has no more records to process, because all cursors are exhausted, so it calls wait().
    2- a new change Upd1 comes in for an exhausted cursor, medium consistency cannot move.
    3- a new change Upd2 comes in for a cursor that is not already opened, medium consistency can move, so the change number indexer is woken up.
    4- on wake up, the change number indexer calls next(), advancing the CompositeDBCursor, which recycles the exhausted cursor, then calls next() on it, making it lose its change. CompositeDBCursor currentRecord == Upd1.
    5- on the next iteration of the loop in run(), a new cursor is created, triggering the creation of a new CompositeDBCursor => Upd1 is lost. CompositeDBCursor currentRecord == Upd2.
    
    The problem comes from two parts:
    - CompositeDBCursor consumes the next change from a cursor (which completely forgets about this change) and stores it itself
    - ChangeNumberIndexer manages recycling/creating cursors on its own and recreates CompositeDBCursor when a new cursor is created.
    
    The fix required:
    - Preventing CompositeDBCursor from consuming changes from underlying cursors until it can forget about this same change itself.
    - Ensuring only ChangeNumberIndexer handles recycling the cursors it owns instead of having both CompositeDBCursor and ChangeNumberIndexer trying to do it. It is also more performant to let ChangeNumberIndexer manage its cursors.
    
    
    
    CompositeDBCursor.java:
    Added recycleExhaustedCursors field to tell the composite whether it can recycle the cursors itself or not (recycling the cursors is currently needed for persistent searches on the changelog; maybe we will be able to remove it in the future, which would simplify the code a lot).
    Modified the ctor to pass in value of recycleExhaustedCursors.
    Removed currentRecord and currentData fields, replaced by reading the record and field on the first entry in the cursors SortedMap.
    Added state field to ensure the first call to next() does not consume the first change in the cursors SortedMap.
    
    ChangeNumberIndexer.java:
    ChangeNumberIndexer now manages cursor recycling and creation on its own and recreates the CompositeDBCursor when needed.
    In run(), removed the now unneeded call to next() after the wait.
    Added recycleExhaustedCursors().
    
    JEChangelogDB.java:
    Consequence of the change to CompositeDBCursor. Kept old recycling behaviour.
    
    
    ChangeNumberIndexerTest.java:
    Added emptyDBTwoDSsDoesNotLoseChanges() to cover the case being fixed by current commit.
    Renamed test methods dropping the "Initial" when it was not adding much to the test comprehension.
    
    CompositeDBCursorTest.java:
    In newUpdateMsg(), added toString() implementation to help debug.
    Removed recycleTwoElementCursorsTODOJNR().
    In recycleTwoElementCursors(), changed the tests a bit to match the changes to CompositeDBCursor.
    
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10667 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
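
The core idea of the fix can be illustrated with a minimal sketch: a merging cursor that keeps no copy of the current record and reads it from the first entry of a sorted map of underlying cursors, with a flag so that the first next() does not consume anything. RecordCursor, ListCursor and MergingCursor are hypothetical names, not the OpenDJ classes, and the sketch assumes records are distinct across cursors.

```java
import java.util.Iterator;
import java.util.List;
import java.util.TreeMap;

interface RecordCursor<T> {
    boolean next();
    T getRecord();
}

/** Simple cursor over an in-memory list, used to feed the merging cursor. */
final class ListCursor<T> implements RecordCursor<T> {
    private final Iterator<T> it;
    private T current;

    ListCursor(List<T> records) {
        this.it = records.iterator();
    }

    public boolean next() {
        current = it.hasNext() ? it.next() : null;
        return current != null;
    }

    public T getRecord() {
        return current;
    }
}

/** Merges several sorted cursors; the current record is always read from the head cursor. */
final class MergingCursor<T extends Comparable<T>> implements RecordCursor<T> {
    // Underlying cursors keyed by the record each one is currently positioned on.
    private final TreeMap<T, RecordCursor<T>> positioned = new TreeMap<>();
    private boolean started;

    MergingCursor(List<RecordCursor<T>> cursors) {
        for (RecordCursor<T> cursor : cursors) {
            if (cursor.next()) {
                positioned.put(cursor.getRecord(), cursor);
            }
        }
    }

    public boolean next() {
        if (!started) {
            started = true; // first call: expose the head record without consuming it
        } else if (!positioned.isEmpty()) {
            // Only now is the previous head record given up and its cursor advanced.
            RecordCursor<T> head = positioned.pollFirstEntry().getValue();
            if (head.next()) {
                positioned.put(head.getRecord(), head);
            }
        }
        return !positioned.isEmpty();
    }

    public T getRecord() {
        return started && !positioned.isEmpty() ? positioned.firstKey() : null;
    }
}
```

Because an underlying cursor is advanced only when the composite moves past its record, discarding and recreating the composite between two calls does not drop the record it was positioned on.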
     

17 Apr, 2014

2 commits

  • …beat from known replicas
    
    On startup, if a replication server knows about several replicas, then it must wait to receive some sort of alive information for each of them before being able to move the medium consistency forward.
    Changes or heartbeats received after replication server started are acceptable.
    Likewise, changes that would have been received before the replication server stopped are also acceptable.
    
    This was fixed on replication server startup, by initializing the lastAliveCSN for each known replica, with the oldest possible CSN (timestamp == 0).
    Then when checking if the medium consistency can move forward, if no medium consistency is set, then the lastAliveCSN for each known replica must have a timestamp != 0.
    
    
    ChangeNumberIndexer.java:
    In canMoveForwardMediumConsistencyPoint(), call allInitialReplicasArePastOldestPossibleCSN() if the medium consistency CSN is not set.
    Added methods oldestPossibleCSN(), allInitialReplicasAreAlive().
    
    ChangeNumberIndexerTest.java:
    In emptyDBTwoInitialDSs(), slightly modified the code to test current bug.
    In startCNIndexer(), added the initial ECL enabled domains as a parameter.
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10666 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
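
A minimal sketch of the startup check described above, using plain timestamps instead of CSNs. AliveTracker and its method names are hypothetical; only the idea of initializing each known replica with timestamp 0 and requiring a non-zero timestamp for all of them comes from the commit message.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical tracker of the last "alive" timestamp seen for each known replica. */
final class AliveTracker {
    static final long OLDEST_POSSIBLE_TIMESTAMP = 0L;

    private final Map<Integer, Long> lastAliveTimestamps = new ConcurrentHashMap<>();

    AliveTracker(int... knownReplicaIds) {
        // On startup, every known replica starts at the oldest possible timestamp.
        for (int replicaId : knownReplicaIds) {
            lastAliveTimestamps.put(replicaId, OLDEST_POSSIBLE_TIMESTAMP);
        }
    }

    /** Records a change or heartbeat received from a replica. */
    void onChangeOrHeartbeat(int replicaId, long timestamp) {
        lastAliveTimestamps.merge(replicaId, timestamp, Math::max);
    }

    /** True once every known replica has reported something newer than the initial value. */
    boolean allKnownReplicasHaveReported() {
        for (long timestamp : lastAliveTimestamps.values()) {
            if (timestamp == OLDEST_POSSIBLE_TIMESTAMP) {
                return false;
            }
        }
        return true;
    }
}
```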
     
  • The changeNumber progression was blocked when the mediumConsistencyCSN was from a different baseDN than the new change to add to the changeNumber index DB.
    Fixed this problem by also storing the baseDN of the mediumConsistencyCSN in the ChangeNumberIndexer class.
    
    
    ChangeNumberIndexer.java:
    Renamed mediumConsistencyCSN to mediumConsistency and changed its type from CSN to Pair<DN, CSN>.
    In tryNotify(), removed the now useless baseDN parameter (replaced with mediumConsistency field).
    Improved comments.
    
    ChangeNumberIndexerTest.java:
    Renamed BASE_DN to BASE_DN1.
    Added BASE_DN2 and test emptyDBTwoInitialDSsDifferentDomains().
    In assertExternalChangelogContent(), did some renaming.
    
    SequentialDBCursor.java:
    Improved toString().
    
    ChangeNumberIndexDB.java:
    Removed obsolete FIXME.
    
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10665 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
     

15 Apr, 2014

1 commit

  • Small enhancements / cleanups.
    
    
    ChangeNumberIndexer.java:
    Always updated lastAliveCSNs as the very last thing because it is used to decide whether the medium consistency point can move forward.
    
    
    AddMsg.java:
    In toString(), renamed the incorrectly named "changeNumber" attribute to "csn" + merged V1 and V2 protocol paths.
    Extracted method encodeAttributes().
    javadoc cleanup.
    
    DeleteMsg.java, ModifyDNMsg.java, ModifyMsg.java:
    In toString(), renamed the incorrectly named "changeNumber" attribute to "csn" + merged V1 and V2 protocol paths.
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10662 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
     

14 Apr, 2014

1 commit

  • Code cleanup in ECLServerHandler:
    The eligibleCSN was used when the ECLServerHandler was responsible for inserting replicaDB changes into the changeNumber index DB.
    To this end, the eligibleCSN was computed to provide a kind of "medium consistency point" ("kind of" because it was badly computed).
    After OPENDJ-1174, it is no longer ECLServerHandler's responsibility to insert replicaDB changes into the changeNumber index DB, so the eligibleCSN is now useless and this commit removes it.
    
    In addition this commit enhances the toString() methods for better readability.
    
    
    
    ECLServerHandler.java:
    Removed eligibleCSN field + refreshEligibleCSN().
    Renamed computeNextEligibleMessageForDomain() to computeNextAvailableMessage() + simplified code.
    In DomainContext, removed nextNonEligibleMsg field + removed isEligible(), debugInfo() and toString(CSN).
    In dumpState(), made the debug string more readable.
    In buildDomainContexts(), used DomainContext new ctor + extracted method newDomainContext() to build full DomainContext objects in one go + fixed a possible ConcurrentModificationException.
    In DomainContext, made some fields final + added a ctor + made toString() more readable.
    
    ReplicationServerDomain.java:
    Removed getEligibleCSN() and isServerConnected(), now unused.
    
    ReplicationServer.java:
    Removed getEligibleCSN(), now unused.
    
    ReplicationDomainDB.java, JEChangelogDB.java, ChangeNumberIndexer.java:
    Removed getDomainLastAliveCSNs(), now unused.
    
    ExternalChangeLogTest.java:
    Consequence of the changes above.
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10655 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
     

11 Apr, 2014

1 commit


10 Apr, 2014

2 commits


08 Apr, 2014

1 commit

  • Decouple writing of status messages (change time heartbeats, monitor, and topology msgs) from RS reader threads through the use of a simple event service. It is now the responsibility of the StatusAnalyzer thread to send status messages when notified to do so by the ReplicationServerDomain. In addition, the Monitor*Msgs are no longer routable since they were only ever sent directly between peers. This simplifies some of the request processing in ReplicationServerDomain.
    
    This change does not attempt to solve potential deadlocks arising from transmission of assured replication acks, status changes, generation ID updates, windowing messages (which are deprecated), total update messages, and error messages.
    
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10634 41b1ffd8-f28e-4786-ab96-9950f0a78031
    matthew
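
One way to picture the decoupling is a small set of pending flags: reader threads only mark that a kind of status message is due, and a single sender thread later drains the flags and does the writing. This is a sketch under that assumption; PendingStatusEvents and its Event enum are hypothetical and not taken from the OpenDJ code.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical event holder shared between reader threads and one status sender thread. */
final class PendingStatusEvents {
    enum Event { CHANGE_TIME_HEARTBEAT, MONITOR, TOPOLOGY }

    private final EnumSet<Event> pending = EnumSet.noneOf(Event.class);

    /** Called by reader threads: cheap, never performs network I/O. */
    synchronized void notifyPending(Event event) {
        pending.add(event);
        notifyAll(); // wakes the sender thread if it is waiting in awaitAndDrain()
    }

    /** Called by the sender thread: returns and clears what is pending, without blocking. */
    synchronized Set<Event> drain() {
        EnumSet<Event> copy = EnumSet.copyOf(pending);
        pending.clear();
        return copy;
    }

    /** Called by the sender thread: blocks until at least one event is pending. */
    synchronized Set<Event> awaitAndDrain() throws InterruptedException {
        while (pending.isEmpty()) {
            wait();
        }
        return drain();
    }
}
```

Repeated notifications of the same kind collapse into a single pending flag, so a burst of events results in one status message per kind.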
     

03 Apr, 2014

1 commit

  • JEChangelogDB.java:
    In ChangelogDBPurger.run(), changed code to:
    - support the absence of change number index DB. To simplify matters, code assumes that ds-cfg-compute-change-number does not change during the lifetime of an RS.
    - purge by using a CSN made up from the purge delay rather than using the previous cookie.
    - sleep 500 millis if there are no changes to purge, or sleep till the next change to purge.
    - shut down gracefully without fuss in the logs.
    
    
    JEChangeNumberIndexDB.java:
    In purgeUpTo(), return the oldest non purged CSN rather than the previous cookie + merged two branches of the code.
    
    ExternalChangeLogTest.java:
    Consequence of the change to JEChangeNumberIndexDB.
    Extracted method assertECLLimits() from ECLCompatTestLimits() + added a loop inside it to let the code persist changes asynchronously from the test thread.
    
    JEChangeNumberIndexDBTest.java:
    Renamed testTrim() to testPurge().
    In newReplicationServer(), enabled ds-cfg-compute-change-number.
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10617 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
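
The sleeping rule above can be sketched as a pure function. This is one possible reading, assuming that "the next change to purge" is the oldest change that is still too recent to purge; PurgeSchedule and its method names are hypothetical.

```java
import java.util.OptionalLong;

/** Hypothetical helper computing the purge limit and how long a purger thread should sleep. */
final class PurgeSchedule {
    static final long DEFAULT_SLEEP_MILLIS = 500;

    /** Changes with a timestamp older than this value are eligible for purging. */
    static long purgeTimestamp(long nowMillis, long purgeDelayMillis) {
        return nowMillis - purgeDelayMillis;
    }

    /**
     * @param oldestRemainingMillis timestamp of the oldest non purged change, empty if there is none
     * @return 500 ms when nothing is left to purge, else the time until the oldest change expires
     */
    static long sleepMillis(long nowMillis, long purgeDelayMillis, OptionalLong oldestRemainingMillis) {
        if (!oldestRemainingMillis.isPresent()) {
            return DEFAULT_SLEEP_MILLIS;
        }
        long untilPurgeable = oldestRemainingMillis.getAsLong() + purgeDelayMillis - nowMillis;
        return Math.max(0, untilPurgeable);
    }
}
```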
     

02 Apr, 2014

1 commit

  • After OPENDJ-1174, change number index DB is now populated eagerly (was populated lazily).
    This means we can reap the benefits by changing how purging is done for changelogDB.
    The previous purge process was driven by the replicaDBs: it first purged the replicaDBs, then the change number index DB, which caused many problems such as stale CNIndexDB records.
    New purge process is driven by the change number index DB: first purge change number index DB then purge the replicaDBs based on the oldest valid referenced record in the change number index DB.
    
    Moved JEChangeNumberIndexDB purge thread to JEChangelogDB + made it responsible for purging the ChangeNumberIndexDB and all the ReplicaDBs.
    In JEReplicaDB, thread is now only responsible for flushing, previously it was responsible for trimming and flushing which complicated the design and made it less efficient for both operations.
    
    
    
    JEChangelogDB.java:
    Added inner class ChangelogDBPurger.
    Added fields purgeDelay, cnPurger and latestTrimDate.
    Extracted method startIndexer().
    Removed getChangeNumberIndexDB(boolean) + associated code.
    Reimplemented getDomainLatestTrimDate() and setPurgeDelay().
    
    JEChangeNumberIndexDB.java:
    No longer implements Runnable + removed run().
    Removed fields trimmingThread, trimAge and replicationServer.
    In ctor, removed ReplicationServer parameter.
    Removed startTrimmingThread(), setPurgeDelay() and clear(DN, AtomicBoolean).
    Renamed clear(DN) to removeDomain(DN).
    Renamed trim(AtomicBoolean) to purgeUpTo(long).
    Added purgeUpToCookie(ChangeNumberIndexRecord).
    
    JEReplicaDB.java:
    Removed fields latestTrimDate and trimAge.
    In run(), no longer call trim().
    Removed getLatestTrimDate(), isQueueAboveLowMark(), setPurgeDelay() and getQueueSize().
    Renamed trim() to purgeUpTo(CSN).
    Made flush() private + reduced wait time on polling the msgQueue to speed up shutdown.
    Added getNumberRecords() for unit tests.
    
    JEReplicaDBCursor.java:
    Since flushing is now eager, removed all calls to JEReplicaDB.flush().
    Extracted method closeCursor().
    
    ReplicationServer.java:
    Renamed getTrimAge() to getPurgeDelay() for consistency.
    
    ReplicationDB.java:
    Added getNumberRecords().
    
    ChangeNumberIndexer.java, ChangeNumberIndexerTest.java:
    Removed hacks due to old purging code.
    
    ExternalChangeLogTest.java:
    Called ChangelogDB.setPurgeDelay() instead of ChangeNumberIndexDB.setPurgeDelay(0).
    Consequence of the changes to JEReplicaDB.
    Removed method setPurgeDelayToInitialValue() that was not doing anything (JEChangeNumberIndexDB was always null).
    In getCNIndexDB(), removed useless null check.
    
    JEChangeNumberIndexDBTest.java:
    Removed constants value1, value2, value3 which are invalid cookies.
    Replaced them with fields previousCookie and cookies.
    Added clearCookie() method.
    In addRecord(), removed cookie parameter and build the cookie from the new fields.
    Consequence of the changes to JEChangeNumberIndexDB.
    
    JEReplicaDBTest.java:
    In waitChangesArePersisted(), used JEReplicaDB.getNumberRecords() instead of JEReplicaDB.getQueueSize() + added parameters describing the number of expected records + the counter record window.
    Replaced all calls to JEReplicaDB.flush() with calls to waitChangesArePersisted().
    Consequence of the changes to JEReplicaDB.
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10614 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
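
The new purge order can be sketched with in-memory collections: purge the index first, then purge the replica data only up to the oldest record the index still references. ChangelogPurgeSketch is hypothetical, uses a single replica DB and plain timestamps instead of CSNs, and assumes CSNs grow with change numbers.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical in-memory model: one change number index and one replica DB. */
final class ChangelogPurgeSketch {
    final TreeMap<Long, Long> cnIndex = new TreeMap<>(); // change number -> CSN timestamp
    final TreeSet<Long> replicaDb = new TreeSet<>();     // CSN timestamps of stored changes

    void add(long changeNumber, long csn) {
        replicaDb.add(csn);
        cnIndex.put(changeNumber, csn);
    }

    void purgeOlderThan(long purgeCsn) {
        // 1. Purge the change number index first.
        Iterator<Map.Entry<Long, Long>> it = cnIndex.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() >= purgeCsn) {
                break;
            }
            it.remove();
        }
        // 2. Purge the replica DB up to the oldest record still referenced by the index.
        long replicaLimit = cnIndex.isEmpty() ? purgeCsn : cnIndex.firstEntry().getValue();
        replicaDb.headSet(replicaLimit).clear();
    }
}
```

With this order the index never points at a replica record that has already been purged.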
     

01 Apr, 2014

1 commit


31 Mar, 2014

2 commits

  • Code review: Matthew Swift
    
    Code fix after r10601.
    Several problems shown by continuous integration:
    * Deadlocks,
    * Long waits
    
    JEReplicaDB.java:
    In clear(), calling collectAllPermits() while holding the flushLock actually prevented flushing from happening because flushing also needed the flushLock. Calling collectAllPermits() does not require holding the flushLock, and once this method returns the msgQueue should be empty anyway and all the changes should have been pushed to the DB (flush() first removes messages from msgQueue, then adds them to the DB, then releases all permits). In effect, there is no need to synchronize on flushLock anymore.
    In trim(), the check for the queue being below the low mark was wrong (comparing queue size with number of bytes): extracted isQueueAboveLowMark() and fixed its definition + removed synchronized (flushLock) because it is now deemed unnecessary.
    In generateCursorFrom(), removed unnecessary call to flush() because that method is also called from JEReplicaDBCursor ctor.
    
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10608 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
     
  • …formatted shell commands; OPENDJ-1376: Add <userinput> and potential <computeroutput> to <screen> content
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10607 41b1ffd8-f28e-4786-ab96-9950f0a78031
    mark
     

28 Mar, 2014

2 commits

  • Fixed checkstyle failure.
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10603 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
     
  • In JEReplicaDB, simplified the logic that handled the internal queue that is used before actually persisting UpdateMsg changes to the underlying Berkeley JE DB.
    Simplified the publisher/consumer model (msgQueue.add() / msgQueue.remove()) by relying on a LinkedBlockingQueue and a semaphore, instead of many synchronized blocks and fields that cluttered this code.
    
    
    
    JEReplicaDB.java:
    Changed msgQueue from LinkedList to LinkedBlockingQueue.
    Removed fields queueMaxSize, queueLowmark, queueHimark, queueLowmarkBytes, queueHimarkBytes, queueByteSize and replaced them all with queueSizeBytes Semaphore.
    Removed clearQueue() and getChanges().
    Added collectAllPermits().
    Added immutable CSNLimits class to remove the need for synchronizing on oldest and newest CSNs.
    
    ReplicationDB.java:
    In addEntries(), now return the total size of the persisted messages (return type was void).
    
    JEReplicaDBTest.java:
    In testTrim(), allowed the test to finish + made the code clearer.
    
    replication.properties:
    Added an error message for adding a change to the JEReplicaDB.
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10601 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
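
A minimal sketch of the LinkedBlockingQueue plus Semaphore combination described above, where permits represent bytes. The field names msgQueue and queueSizeBytes come from the commit message; ByteBoundedQueue, its methods and the use of byte[] as the message type are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;

/** Hypothetical queue bounded by the total number of bytes it holds. */
final class ByteBoundedQueue {
    private final LinkedBlockingQueue<byte[]> msgQueue = new LinkedBlockingQueue<>();
    private final Semaphore queueSizeBytes; // one permit per byte that may still be queued

    ByteBoundedQueue(int maxBytes) {
        this.queueSizeBytes = new Semaphore(maxBytes);
    }

    /** Publisher side: blocks until enough bytes are free (forever if msg exceeds maxBytes). */
    void add(byte[] msg) throws InterruptedException {
        queueSizeBytes.acquire(msg.length);
        msgQueue.add(msg);
    }

    /** Non-blocking variant: returns false when the message does not fit right now. */
    boolean tryAdd(byte[] msg) {
        if (!queueSizeBytes.tryAcquire(msg.length)) {
            return false;
        }
        msgQueue.add(msg);
        return true;
    }

    /** Consumer side: removes the messages, would persist them, then gives the permits back. */
    int flush() {
        List<byte[]> batch = new ArrayList<>();
        msgQueue.drainTo(batch);
        int bytes = 0;
        for (byte[] msg : batch) {
            bytes += msg.length; // a real implementation would write msg to the DB here
        }
        queueSizeBytes.release(bytes);
        return bytes;
    }
}
```

The semaphore replaces the separate size counters and low/high marks: publishers block on the permits, and the consumer frees them only after the batch has been handled.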
     

27 Mar, 2014

1 commit


26 Mar, 2014

1 commit


25 Mar, 2014

1 commit


21 Mar, 2014

1 commit


14 Mar, 2014

2 commits


10 Mar, 2014

1 commit


06 Mar, 2014

1 commit


05 Mar, 2014

1 commit


04 Mar, 2014

1 commit


24 Feb, 2014

1 commit


20 Feb, 2014

2 commits


19 Feb, 2014

2 commits

  • Improved design for Replication Topology
    
    
    ReplicationBroker.java + *Test.java:
    Extracted the Topology class to encapsulate the dsList and replicationServerInfos fields + atomically set it via an AtomicReference + moved setLocallyConfiguredFlag(), isSameReplicationServerUrl(), computeConnectedDSs() to Topology class.
    Created methods computeNewTopology() and topologyChange() to compute and set the new topology.
    Removed generationID instance variable duplicated with the one from ReplicationDomain + updated ctor and setGenerationID().
    Improved debugging messages.
    Renamed getDsList() to getReplicaInfos() + changed return type from List<DSInfo> to Map<Integer, DSInfo>.
    Renamed getRsList() to getRsInfos().
    Extracted method toRSInfos().
    
    ReplicationBrokerTest.java: ADDED
    Added to test new ReplicationBroker.Topology class.
    
    
    DSInfo.java:
    Changed equals(Set<String>, Set<String>) to equals(Object, Object).
    In toString(), hid the assured fields if assured replication is off.
    
    RSInfo.java:
    Renamed fields id and serverUrl to rsServerId and rsServerURL.
    In toString(), relied on the compiler to generate the String.
    
    
    TopologyMsg.java:
    Renamed fields dsList and rsList to replicaInfos and rsInfos.
    Renamed getDsList() to getReplicaInfos() + changed return type from List<DSInfo> to Map<Integer, DSInfo>.
    Renamed getRsList() to getRsInfos().
    Extracted methods readStrings() and writeStrings().
    Code cleanup.
    Used javadocs.
    
    
    ReplicationDomain.java, replication*.properties:
    Extracted class ECLIncludes to replace eclIncludesLock, eclIncludesByServer, eclIncludesAllServers, eclIncludesForDeletesByServer, eclIncludesForDeletesAllServers + moved setEclIncludes() implementation to this class + removed synchronized blocks from all the getters.
    Added baseDN to ERR_INIT_NO_SUCCESS_START_FROM_SERVERS.
    Inlined initializeRemote().
    Removed unused initializeFromRemote().
    Reduced methods visibility.
    Renamed getReplicaList() to getReplicaInfos() + changed return type from List<DSInfo> to Map<Integer, DSInfo>.
    Renamed getRsList() to getRsInfos().
    In initializeRemote(), waitForRemoteEndOfInit(), isRemoteDSConnected() and getProtocolVersion(), simplified code.
    
    *.java:
    Consequence of the changes above.
    Simplified code in a number of places, particularly in the tests.
    
    git-svn-id: https://svn.forgerock.org/opendj/trunk@10405 41b1ffd8-f28e-4786-ab96-9950f0a78031
    JnRouvignac
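
The Topology extraction follows a common pattern: group related fields into one immutable object and publish it atomically through an AtomicReference. The sketch below assumes String values instead of DSInfo and RSInfo; TopologySketch and BrokerSketch are hypothetical names, not the OpenDJ classes.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical immutable snapshot of the replicas and replication servers. */
final class TopologySketch {
    final Map<Integer, String> replicaInfos;
    final Map<Integer, String> rsInfos;

    TopologySketch(Map<Integer, String> replicaInfos, Map<Integer, String> rsInfos) {
        // Defensive copies so that later changes by the caller cannot alter the snapshot.
        this.replicaInfos = Collections.unmodifiableMap(new HashMap<>(replicaInfos));
        this.rsInfos = Collections.unmodifiableMap(new HashMap<>(rsInfos));
    }
}

/** Hypothetical broker holding the current topology. */
final class BrokerSketch {
    private final AtomicReference<TopologySketch> topology = new AtomicReference<>(
        new TopologySketch(Collections.<Integer, String>emptyMap(), Collections.<Integer, String>emptyMap()));

    /** Both maps are replaced in a single atomic step. */
    void topologyChange(Map<Integer, String> replicaInfos, Map<Integer, String> rsInfos) {
        topology.set(new TopologySketch(replicaInfos, rsInfos));
    }

    TopologySketch getTopology() {
        return topology.get();
    }
}
```

Readers that call getTopology() once always see a consistent pair of maps, even if a topology change happens concurrently.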
     
  • git-svn-id: https://svn.forgerock.org/opendj/trunk@10402 41b1ffd8-f28e-4786-ab96-9950f0a78031
    cjr
     

12 Feb, 2014

1 commit


11 Feb, 2014

1 commit


10 Feb, 2014

3 commits


07 Feb, 2014

1 commit