MySQL Forums
Forum List  »  PHP

Query Help - Duplicate Records
Posted by: Christopher Wibberley
Date: November 20, 2016 04:11AM

Hi,

I'm struggling to build new functionality into an existing MySQL query.

The purpose of this function is to find duplicate entries in a table. The table is used by a data capture form attached to a Wi-Fi network. When a guest connects, they are required to enter some personal details in before internet access is granted.

Traditionally, the portals have been configured to only capture data the first time a device connects. For subsequent connections, the device is “remembered” and is automatically authorized. For some reason I’ve found there to be a lot of duplicate data – whether that’s from users not giving enough time for the Wi-Fi system to authorize them, or for some other reason. To keep the captured data clean, this query is run. The PHP function to then delete the duplicate rows is below:

$queryDuplicateRows = mysql_query("SELECT COUNT(mac) as cnt, GROUP_CONCAT(id) AS ids FROM $table GROUP BY mac HAVING cnt > 1");



function deleteDuplicateRows(){
global $table;

$queryDuplicateRows = mysql_query("SELECT COUNT(mac) as cnt, GROUP_CONCAT(id) AS ids FROM $table GROUP BY mac HAVING cnt > 1");

while($row = mysql_fetch_array( $queryDuplicateRows )) {
// Find position in string of IDs of the first comma, to denote the end of the first value
$pos = strpos(strval($row[1]), ",");
// Create substring, omitting the first value so that record can be retained
$allIDsOfDuplicateMAC = substr(strval($row[1]), $pos+1);
// Create array from the substring, using the commas as the delimiter
$arrayOfIDsOfDuplicateMAC = explode(",", $allIDsOfDuplicateMAC);
foreach ($arrayOfIDsOfDuplicateMAC as $i){
mysql_query("delete from $table where id=$i");
}
}
}


Some of my customers are now asking for a record of each time a device connects. For a reconnection, the device is still “remembered” (via a search in the database based on the device MAC address), and then duplicate data is then entered into a new row into the table. Because of the query above though, this will be classed as duplicate data. What I want is to modify the function so that it only returns rows that have multiple entries within a single day. For example, below could be an example dataset returned by the above query:

MAC Name Date
AA:BB:CC:DD:EE:FF Chris 11/05/2016
AA:BB:CC:DD:EE:FF Chris 12/05/2016
AA:BB:CC:DD:EE:FF Chris 12/05/2016 ***
AA:BB:CC:DD:EE:FF Chris 13/05/2016
AA:BB:CC:DD:EE:FF Chris 14/05/2016
AA:BB:CC:DD:EE:FF Chris 14/05/2016 ***
AA:BB:CC:DD:EE:FF Chris 14/05/2016 ***
AA:BB:CC:DD:EE:FF Chris 15/05/2016

The rows with stars next to them would be classed as errors since there are multiple entries for that day. The output would then need to be passed to the PHP function in the same way. The data in the "date" field is just VARCHAR, and appears in the same format as the above.

I'm now completely lost with trying to develop this query further. If anybody could provide some help/thoughts I'd really appreciate it.

Kind regards,
Chris.

Options: ReplyQuote


Subject
Written By
Posted
Query Help - Duplicate Records
November 20, 2016 04:11AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.